RDKIT2FPS(1) User Commands RDKIT2FPS(1)
NAME
rdkit2fps - generate FPS fingerprints from a structure file using RDKit
DESCRIPTION
usage: rdkit2fps [-h] [--RDK] [--fpSize INT] [--minPath INT] [--maxPath INT]
                 [--nBitsPerHash INT] [--useHs USEHS] [--maccs166] [--substruct]
                 [--rdmaccs] [--id-tag NAME] [--in FORMAT] [-o FILENAME]
                 [--errors {strict,report,ignore}] [filenames [filenames ...]]
Generate FPS fingerprints from a structure file using RDKit
positional arguments:
filenames
input structure files (default is stdin)
optional arguments:
-h, --help
show this help message and exit
--id-tag NAME
tag name containing the record id (SD files only)
--in FORMAT
input structure format (default guesses from filename)
-o FILENAME, --output FILENAME
save the fingerprints to FILENAME (default=stdout)
--errors {strict,report,ignore}
how should structure parse errors be handled? (default=strict)
RDKit topological fingerprints:
--RDK
generate RDK fingerprints (default)
--fpSize INT
number of bits in the fingerprint (default=2048)
--minPath INT
minimum number of bonds to include in the subgraphs (default=1)
--maxPath INT
maximum number of bonds to include in the subgraphs (default=7)
--nBitsPerHash INT
number of bits to set per path (default=4)
--useHs USEHS
include information about the number of hydrogens on each atom
166 bit MACCS substructure keys:
--maccs166
generate MACCS fingerprints
881 bit substructure keys:
--substruct
generate ChemFP substructure fingerprints
ChemFP version of the 166 bit RDKit/MACCS keys:
--rdmaccs
generate 166 bit RDKit/MACCS fingerprints
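The options above can be combined on a single command line. As a minimal sketch, the Python helper below assembles an argument list for the default RDK fingerprint type using only the flags documented above. The helper name build_rdkit2fps_command and the file names are hypothetical and not part of rdkit2fps; the code only builds and prints the command, it does not run it.

```python
import shlex

def build_rdkit2fps_command(filenames, output=None, fp_size=2048,
                            min_path=1, max_path=7, n_bits_per_hash=4,
                            errors="strict"):
    """Assemble an rdkit2fps argument list (hypothetical helper).

    The keyword defaults mirror the defaults listed in the option help.
    """
    if errors not in ("strict", "report", "ignore"):
        raise ValueError("errors must be one of: strict, report, ignore")
    cmd = [
        "rdkit2fps", "--RDK",
        "--fpSize", str(fp_size),
        "--minPath", str(min_path),
        "--maxPath", str(max_path),
        "--nBitsPerHash", str(n_bits_per_hash),
        "--errors", errors,
    ]
    if output is not None:
        cmd += ["-o", output]
    # With no filenames, rdkit2fps reads from stdin.
    cmd += list(filenames)
    return cmd

# Hypothetical input and output file names, for illustration only.
cmd = build_rdkit2fps_command(["drugs.sdf"], output="drugs.fps", fp_size=1024)
print(shlex.join(cmd))
```

To actually run the command, the list could be passed to subprocess.run(cmd, check=True), assuming rdkit2fps is installed and on the PATH.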
This program guesses the input structure format from the filename extension. If the data comes from stdin, or the extension is
unknown, use "--in" to specify the input format. The supported format extensions are:
File Type  Valid FORMATs (use gz if compressed)
---------  ------------------------------------
SMILES     smi, ism, can, smi.gz, ism.gz, can.gz
SDF        sdf, mol, sd, mdl, sdf.gz, mol.gz, sd.gz, mdl.gz
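The extension lookup described by the table above can be sketched as follows. This is an illustration only, not the code rdkit2fps uses: the names FORMATS and guess_format are hypothetical, and the choice to match extensions exactly (case-sensitive) and to return None for an unknown extension is an assumption.

```python
# Extensions from the table above, grouped by the file type they imply.
FORMATS = {
    "SMILES": ("smi", "ism", "can"),
    "SDF": ("sdf", "mol", "sd", "mdl"),
}

def guess_format(filename):
    """Return (file_type, compressed), or None if the extension is unknown.

    Hypothetical helper; matching is exact and case-sensitive by assumption.
    """
    name = filename
    compressed = name.endswith(".gz")
    if compressed:
        name = name[:-3]  # strip the trailing ".gz"
    _, dot, ext = name.rpartition(".")
    if not dot:
        return None  # no extension left to inspect
    for file_type, extensions in FORMATS.items():
        if ext in extensions:
            return file_type, compressed
    return None

print(guess_format("ligands.sdf.gz"))
print(guess_format("library.smi"))
print(guess_format("notes.txt"))
```

When a lookup like this finds no match, or the input comes from stdin, the format has to be given explicitly with "--in".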
rdkit2fps 1.0 June 2012 RDKIT2FPS(1)