sdf2fps(1) [debian man page]

SDF2FPS(1)							   User Commands							SDF2FPS(1)

NAME
       sdf2fps - sdf2fps

DESCRIPTION
       usage: sdf2fps [-h] [--id-tag TAG] [--fp-tag TAG] [--num-bits INT]
                      [--errors {strict,report,ignore}] [-o FILENAME]
                      [--software TEXT] [--type TEXT] [--decompress METHOD]
                      [--binary] [--binary-msb] [--hex] [--hex-lsb]
                      [--hex-msb] [--base64] [--cactvs] [--decoder DECODER]
                      [--pubchem] [filenames [filenames ...]]

       Extract a fingerprint tag from an SD file and generate FPS fingerprints

   positional arguments:
       filenames
              input SD files (default is stdin)

   optional arguments:
       -h, --help
              show this help message and exit

       --id-tag TAG
              get the record id from TAG instead of the first line of the
              record

       --fp-tag TAG
              get the fingerprint from tag TAG (required)

       --num-bits INT
              use the first INT bits of the input. Use only when the last 1-7
              bits of the last byte are not part of the fingerprint.
              Unexpected errors will occur if these bits are not all zero.

       --errors {strict,report,ignore}
              how should structure parse errors be handled? (default=strict)

       -o FILENAME, --output FILENAME
              save the fingerprints to FILENAME (default=stdout)

       --software TEXT
              use TEXT as the software description

       --type TEXT
              use TEXT as the fingerprint type description

       --decompress METHOD
              use METHOD to decompress the input (default='auto', 'none',
              'gzip', 'bzip2')

   Fingerprint decoding options:
       --binary
              Encoded with the characters '0' and '1'. Bit #0 comes first.
              Example: 00100000 encodes the value 4

       --binary-msb
              Encoded with the characters '0' and '1'. Bit #0 comes last.
              Example: 00000100 encodes the value 4

       --hex  Hex encoded. Bit #0 is the first bit (1<<0) of the first byte.
              Example: 01f2 encodes the value \x01\xf2 = 498

       --hex-lsb
              Hex encoded. Bit #0 is the eighth bit (1<<7) of the first byte.
              Example: 804f encodes the value \x01\xf2 = 498

       --hex-msb
              Hex encoded. Bit #0 is the first bit (1<<0) of the last byte.
              Example: f201 encodes the value \x01\xf2 = 498

       --base64
              Base-64 encoded. Bit #0 is the first bit (1<<0) of the first
              byte. Example: AfI= encodes the value \x01\xf2 = 498

       --cactvs
              CACTVS encoding, based on base64 and includes a version and bit
              length

       --decoder DECODER
              import and use the DECODER function to decode the fingerprint

   shortcuts:
       --pubchem
              decode CACTVS substructure keys used in PubChem. Same as
              --software=CACTVS/unknown --type 'CACTVS-E_SCREEN/1.0
              extended=2' --fp-tag=PUBCHEM_CACTVS_SUBSKEYS --cactvs

sdf2fps 1.0                        June 2012                        SDF2FPS(1)
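
The example strings given for the fingerprint decoding options all describe the same fingerprint bytes: the single byte \x04 for the two binary forms, and the two bytes \x01\xf2 for the hex and base64 forms. The Python sketch below illustrates those bit and byte orderings. It is not the sdf2fps implementation: the function names are hypothetical, and padding a binary string up to a whole number of bytes is an assumption made only for this illustration.

```python
import base64


def decode_binary(text):
    """'0'/'1' characters, bit #0 first: character i is bit i % 8 of byte i // 8."""
    if set(text) - {"0", "1"}:
        raise ValueError("binary fingerprints may only contain '0' and '1'")
    out = bytearray((len(text) + 7) // 8)  # assumption: pad to whole bytes
    for i, ch in enumerate(text):
        if ch == "1":
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


def decode_binary_msb(text):
    """'0'/'1' characters, bit #0 last: reverse the string, then decode as above."""
    return decode_binary(text[::-1])


def decode_hex(text):
    """Hex, bit #0 is 1<<0 of the first byte: a plain hex decode."""
    return bytes.fromhex(text)


def decode_hex_lsb(text):
    """Hex, bit #0 is 1<<7 of the first byte: reverse the bits inside each byte."""
    return bytes(int(format(b, "08b")[::-1], 2) for b in bytes.fromhex(text))


def decode_hex_msb(text):
    """Hex, bit #0 is 1<<0 of the last byte: reverse the order of the bytes."""
    return bytes.fromhex(text)[::-1]


def decode_base64(text):
    """Base-64, bit #0 is 1<<0 of the first byte: a plain base64 decode."""
    return base64.b64decode(text)


# The example strings from the option descriptions, checked against each other.
assert decode_binary("00100000") == decode_binary_msb("00000100") == b"\x04"
assert (
    decode_hex("01f2")
    == decode_hex_lsb("804f")
    == decode_hex_msb("f201")
    == decode_base64("AfI=")
    == b"\x01\xf2"
)
```

Reading the two bytes \x01\xf2 as one big-endian number, int.from_bytes(b"\x01\xf2", "big"), gives 498, the value quoted in the hex and base64 examples.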

RDKIT2FPS(1)							   User Commands						      RDKIT2FPS(1)

NAME
       rdkit2fps - rdkit2fps

DESCRIPTION
       usage: rdkit2fps [-h] [--RDK] [--fpSize INT] [--minPath INT]
                        [--maxPath INT] [--nBitsPerHash INT] [--useHs USEHS]
                        [--maccs166] [--substruct] [--rdmaccs]
                        [--id-tag NAME] [--in FORMAT] [-o FILENAME]
                        [--errors {strict,report,ignore}]
                        [filenames [filenames ...]]

       Generate FPS fingerprints from a structure file using RDKit

   positional arguments:
       filenames
              input structure files (default is stdin)

   optional arguments:
       -h, --help
              show this help message and exit

       --id-tag NAME
              tag name containing the record id (SD files only)

       --in FORMAT
              input structure format (default guesses from filename)

       -o FILENAME, --output FILENAME
              save the fingerprints to FILENAME (default=stdout)

       --errors {strict,report,ignore}
              how should structure parse errors be handled? (default=strict)

   RDKit topological fingerprints:
       --RDK  generate RDK fingerprints (default)

       --fpSize INT
              number of bits in the fingerprint (default=2048)

       --minPath INT
              minimum number of bonds to include in the subgraphs (default=1)

       --maxPath INT
              maximum number of bonds to include in the subgraphs (default=7)

       --nBitsPerHash INT
              number of bits to set per path (default=4)

       --useHs USEHS
              information about the number of hydrogens on each atom

   166 bit MACCS substructure keys:
       --maccs166
              generate MACCS fingerprints

   881 bit substructure keys:
       --substruct
              generate ChemFP substructure fingerprints

   ChemFP version of the 166 bit RDKit/MACCS keys:
       --rdmaccs
              generate 166 bit RDKit/MACCS fingerprints

       This program guesses the input structure format based on the filename
       extension. If the data comes from stdin, or the extension name is
       unknown, then use "--in" to change the default input format. The
       supported format extensions are:

       File Type      Valid FORMATs (use gz if compressed)
       ---------      ------------------------------------
        SMILES        smi, ism, can, smi.gz, ism.gz, can.gz
        SDF           sdf, mol, sd, mdl, sdf.gz, mol.gz, sd.gz, mdl.gz

rdkit2fps 1.0                      June 2012                      RDKIT2FPS(1)
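
The format guessing described for rdkit2fps can be pictured with the short Python sketch below. It is an illustration only, not the rdkit2fps code: the names split_format and guess_format are hypothetical, matching extensions case-insensitively is an assumption, and the sketch returns None for stdin or an unknown extension because the man page does not say which format the program falls back to.

```python
# Format names taken from the table of supported extensions.
SMILES_FORMATS = {"smi", "ism", "can"}
SDF_FORMATS = {"sdf", "mol", "sd", "mdl"}


def split_format(fmt):
    """Map a FORMAT such as 'sdf.gz' to (file_type, is_gzipped), or None."""
    fmt = fmt.lower()  # assumption: case-insensitive matching
    gzipped = fmt.endswith(".gz")
    if gzipped:
        fmt = fmt[: -len(".gz")]
    if fmt in SMILES_FORMATS:
        return ("SMILES", gzipped)
    if fmt in SDF_FORMATS:
        return ("SDF", gzipped)
    return None


def guess_format(filename=None, in_format=None):
    """Prefer an explicit --in FORMAT; otherwise guess from the filename."""
    if in_format is not None:
        return split_format(in_format)
    if filename is None or "." not in filename:
        return None  # stdin, or no extension to guess from
    stem, ext = filename.rsplit(".", 1)
    if ext.lower() == "gz" and "." in stem:
        # Keep the extension in front of ".gz", e.g. "library.sdf.gz" -> "sdf.gz".
        ext = stem.rsplit(".", 1)[1] + ".gz"
    return split_format(ext)


assert guess_format("compounds.smi") == ("SMILES", False)
assert guess_format("library.sdf.gz") == ("SDF", True)
assert guess_format("notes.txt") is None
assert guess_format(None, in_format="mol") == ("SDF", False)
```

A None result corresponds to the cases where the man page advises passing "--in" explicitly.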