PSMILES - Fun with P🙂s strings
The psmiles Python package provides tools and functions for polymer SMILES (PSMILES or P🙂s) strings. PSMILES is a chemical language for representing polymers.
What is a PSMILES string?
PSMILES strings are text representations of polymer chemical structures. They are useful for data-driven polymer discovery, design, and prediction tasks.
A PSMILES string follows the Daylight SMILES syntax defined at OpenSMILES, but contains two stars ([*] or *) that mark the two endpoints of the polymer repeat unit. See the PSMILES guide for more details.
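The two-star rule can be checked with a few lines of plain Python. This is a minimal sketch that does not use the psmiles package: the helper names `count_endpoints` and `has_two_endpoints` are hypothetical, and the check only counts stars without validating the rest of the SMILES syntax.

```python
import re

# Matches either notation for an endpoint: "[*]" or a bare "*".
# The bracketed form is listed first so "[*]" is counted once, not as a bare "*".
STAR = re.compile(r"\[\*\]|\*")


def count_endpoints(psmiles: str) -> int:
    """Count the endpoint stars in a candidate PSMILES string (hypothetical helper)."""
    return len(STAR.findall(psmiles))


def has_two_endpoints(psmiles: str) -> bool:
    """Return True if the string has exactly two endpoint stars.

    This only counts stars; it does not check that the string is valid SMILES.
    """
    return count_endpoints(psmiles) == 2


print(has_two_endpoints("[*]CC[*]"))  # True: two bracketed stars
print(has_two_endpoints("*CC*"))      # True: two bare stars
print(has_two_endpoints("CC"))        # False: no endpoints
```

A full check would also parse the string as SMILES, for example with a cheminformatics toolkit; the sketch above only covers the endpoint count.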
Example P🙂s:
Polyethylene | Polyethylene oxide | Polypropylene |
---|---|---|
[*]CC[*] | [*]CCO[*] | [*]CC([*])C |
Tip
Create these figures using psmiles
Features
- Polymer Fingerprints: Numerical representations of polymers that can be used to measure polymer similarity. They suit any polymer informatics task that requires numerical representations of polymers, such as property prediction, polymer structure prediction (design tasks), or ML-based synthesis assistants. psmiles offers polyBERT, circular (Morgan), Mordred, and RDKit fingerprints.
- Canonicalize PSMILES: Find a unique representation of the PSMILES string. Useful for many informatics tasks.
- Dimerize PSMILES: Get the dimerized PSMILES string.
- More: Randomize, compute polymer similarity, create alternating copolymers, save chemical drawings.
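To make the idea of dimerization concrete, the sketch below joins two copies of a repeat unit by plain string substitution. It is an illustration only and not the method psmiles uses: `naive_dimer` is a hypothetical helper, and it assumes the endpoints are written as [*], the first endpoint is at the start of the string, and the string contains no ring-closure labels (inputs outside these assumptions are rejected).

```python
STAR = "[*]"


def naive_dimer(psmiles: str) -> str:
    """Join two copies of a repeat unit by string substitution (hypothetical helper).

    The second endpoint of the first copy is replaced by the second copy
    with its leading endpoint removed.
    """
    if psmiles.count(STAR) != 2 or not psmiles.startswith(STAR):
        raise ValueError("expected exactly two [*] endpoints, the first at the start")
    # Ring-closure digits could pair up wrongly after substitution, so reject them.
    if any(ch.isdigit() or ch == "%" for ch in psmiles):
        raise ValueError("ring-closure labels are not handled by this sketch")

    tail = psmiles[len(STAR):]  # second copy without its leading endpoint
    if not (tail[0].isalpha() or tail[0] == "["):
        raise ValueError("the first endpoint must be followed directly by an atom")

    second = psmiles.rfind(STAR)  # position of the second endpoint in the first copy
    return psmiles[:second] + tail + psmiles[second + len(STAR):]


print(naive_dimer("[*]CC[*]"))     # [*]CCCC[*]
print(naive_dimer("[*]CC([*])C"))  # [*]CC(CC([*])C)C
```

The result still has exactly two endpoints, so it is again a PSMILES string. Handling rings, bare * endpoints, or a choice of which endpoints to connect needs a proper molecular graph rather than string operations.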