Support for double-cut linkers with 0 atoms #17

Open
opened 2025-10-14 17:00:35 -06:00 by navan · 0 comments

Originally created by @baoilleach on 10/17/2024

I would like these two molecules to be identified as double-cut matched pairs (e.g. with exocyclic fragmentation):
![image](https://github.com/user-attachments/assets/56927f50-671a-4189-908b-63c2a63bc7f8)

The larger molecule has [*:1]O[*:2] as the linker (shown as *O* in the 2-cut row), which is fine:

$ mmpdb smifrag "c1ccccc1OC2CCCNC2" --cut-smarts "exocyclic"
                   |------------  variable  ------------|       |-----------------------  constant  ---------------------
#cuts | enum.label | #heavies | symm.class | smiles     | order | #heavies | symm.class | smiles              | with-H
------+------------+----------+------------+------------+-------+----------+------------+---------------------+----------
  1   |     N      |    6     |      1     | *C1CCCNC1  |    0  |     7    |      1     | *Oc1ccccc1          | Oc1ccccc1
  1   |     N      |    7     |      1     | *Oc1ccccc1 |    0  |     6    |      1     | *C1CCCNC1           | C1CCNCC1
  2   |     N      |    1     |     11     | *O*        |   01  |    12    |     12     | *C1CCCNC1.*c1ccccc1 | -
  1   |     N      |    7     |      1     | *OC1CCCNC1 |    0  |     6    |      1     | *c1ccccc1           | c1ccccc1
  1   |     N      |    6     |      1     | *c1ccccc1  |    0  |     7    |      1     | *OC1CCCNC1          | OC1CCCNC1

The smaller molecule should have the zero-atom linker [*:1][*:2], but no 2-cut row is reported:

$ mmpdb smifrag "c1ccccc1C2CCCNC2" --cut-smarts "exocyclic"
                   |------------  variable  -----------|       |-----------------  constant  ----------------
#cuts | enum.label | #heavies | symm.class | smiles    | order | #heavies | symm.class | smiles    | with-H
------+------------+----------+------------+-----------+-------+----------+------------+-----------+---------
  1   |     N      |    6     |     1      | *C1CCCNC1 |   0   |    6     |     1      | *c1ccccc1 | c1ccccc1
  1   |     N      |    6     |     1      | *c1ccccc1 |   0   |    6     |     1      | *C1CCCNC1 | C1CCNCC1

I'm hoping there's a secret minsize somewhere that I can change from 1 to 0. Or is support for this likely to require much larger changes?
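
To make the request concrete, here is a minimal sketch of the relationship between the two outputs above. It is plain string handling, not mmpdb code: the `Fragmentation` tuple, `EMPTY_LINKER`, and `empty_linker_from_single_cut` are hypothetical names made up for illustration, and sorting the two fragments is only an assumed way to get a stable constant key. The idea is that a 1-cut record across the ring-ring bond could be reinterpreted as a 2-cut record whose variable part has zero atoms.

```python
# Hypothetical sketch; none of these names are part of mmpdb's API.
from typing import NamedTuple


class Fragmentation(NamedTuple):
    num_cuts: int
    variable: str  # variable-part SMILES
    constant: str  # constant-part SMILES


# The zero-atom linker as written in the issue text.
EMPTY_LINKER = "[*:1][*:2]"


def empty_linker_from_single_cut(frag: Fragmentation) -> Fragmentation:
    """Reinterpret a 1-cut record as a 2-cut record with a zero-atom linker."""
    if frag.num_cuts != 1:
        raise ValueError("expected a single-cut fragmentation")
    # Both sides of the cut bond become the constant part.
    # Sorting is an assumption made here to get a stable key.
    constant = ".".join(sorted([frag.variable, frag.constant]))
    return Fragmentation(2, EMPTY_LINKER, constant)


# Rows copied from the smifrag tables above.
larger = Fragmentation(2, "*O*", "*C1CCCNC1.*c1ccccc1")
smaller = Fragmentation(1, "*C1CCCNC1", "*c1ccccc1")

derived = empty_linker_from_single_cut(smaller)
same_constant = derived.constant == larger.constant
print(derived.variable, derived.constant, same_constant)
```

With the rows above, the derived record shares its constant part with the larger molecule's 2-cut row, which is what would let the two molecules pair up as `*O*` versus `[*:1][*:2]`. Whether mmpdb's canonicalization and attachment-point ordering would line up this simply is not something this sketch shows.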

Reference: github/mmpdb#17