question about details and hyperparameter settings #9

Open
opened 2025-10-14 16:10:54 -06:00 by navan · 0 comments
Owner

Originally created by @unclestrong on 3/3/2022

Hi shion_honda,

I'm a student studying bioinformatics, and I found your research on SMILES Transformer very powerful! Your paper suggests that I can use your pretrained model, or create my own pretrained model for my field.

I can already use the pretrained model you provided on Google Drive to reach impressive accuracy, and now I want to create my own pretrained model with your project on another large-scale dataset.

However, I'm stuck reproducing your pretrained model. I can run your whole project without errors, but I can't reach the accuracy of the model you provided.

So I'm writing to ask about some details and hyperparameter settings. If it's convenient for you, would you please answer a few questions?

1. Did you use half of the ChEMBL24 data? You mention in your paper that you sampled 861,000 SMILES.
2. How many epochs did you train? You wrote 5 in your GitHub project, but the model you provided is named trfm_12_23000, which suggests at least 12.
3. What batch size did you use? You wrote 8 in your GitHub project, but that seems far too small for 861,000 samples.
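To make questions 2 and 3 concrete, here is a small sanity-check sketch. It assumes (this is not confirmed anywhere in the repository) that the checkpoint name `trfm_12_23000` encodes an epoch number and a within-epoch step, and that the corpus is the 861,000 sampled SMILES from the paper:

```python
# Hedged sanity check of the reported settings.
# Assumption (not confirmed by the repo): "trfm_12_23000" = epoch 12, step 23000.
corpus_size = 861_000   # sampled SMILES, per the paper
batch_size = 8          # value given in the GitHub project

steps_per_epoch = corpus_size // batch_size
print(steps_per_epoch)          # 107625 optimizer steps per epoch at batch size 8
print(23_000 < steps_per_epoch) # True: step 23000 fits inside a single epoch,
                                # so "23000" could plausibly be a within-epoch step
```

Under this reading, the checkpoint name is at least arithmetically consistent with batch size 8, but only the author can confirm the actual naming convention and settings.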

In short, I can't train a model as powerful as yours, even when using the same data, so there must be some detail I've overlooked.

I really like your paper, your project, and your open-source spirit, but I'm quite confused. If it's convenient for you, could you please give me a clue or an answer? It would be really helpful.

Thank you sincerely!


Reference: github/smiles-transformer#9