.. _api: Python API ========== .. note:: This section is currently incomplete. We're working to fill out the details of the Python API as soon as possible. Configuration ------------- The ``immunedb.common.config`` module provides methods to initialize a connection to a new or existing database. Most programs using ImmuneDB will start with code similar to: .. code-block:: python import immunedb.common.config as config parser = config.get_base_arg_parser('Some description of the program') # ... add any additional arguments to the parser ... args = parser.parse_args() session = config.init_db(args.db_config) When this script is run, it will require at least one argument which is the path to a database configuration (as generated with ``immunedb_admin``). Using that, a ``Session`` object will be made, connected to the associated database. One can also directly specify the path to a configuration directly. .. code-block:: python import immunedb.common.config as config session = config.init_db('path/to/config') Alternatively a dictionary with the same information can be passed: .. code-block:: python import immunedb.common.config as config session = config.init_db({ 'host': '...', 'database': '...', 'username': '...', 'password': '...', }) Returned will be a ``Session`` object which can be used to interact with the database. Using the Session ----------------- ImmuneDB is built using `SQLAlchemy `_ as a MySQL abstraction layer. Simply put, instead of writing SQL, the database is queried using Python constructs. Full documentation on using the session can be found in `SQLAlchemy's documentation `_. Once a session is created, the models listed below can be queried. Example Queries --------------- Below are some example queries that demonstrate how to use the ImmuneDB API. Clone CDR3s ^^^^^^^^^^^ Get all clones with a given V-gene and print their CDR3 AA sequences. **Input** .. code-block:: python import immunedb.common.config as config from immunedb.common.models import Clone session = config.init_db(...) for clone in session.query(Clone).filter(Clone.v_gene == 'IGHV3-30'): print('clone {} has AAs {}'.format(clone.id, clone.cdr3_aa)) **Output** .. code-block:: bash clone 37884 has AAs CARGYSSSYFDYW clone 37886 has AAs CARSRTSLSIYGVVPTGDFDSW clone 37885 has AAs CARNGLNTVSGVVISPKYWLDPW clone 37887 has AAs CARDLFRGVDFYYYGMDVW Clone Frequency ^^^^^^^^^^^^^^^ Determine how many sequences appear in each sample belonging to clone 1234. Note the ``CloneStats`` model has one entry for each clone/sample combination plus one where the ``sample_id`` field is ``null`` which represents the overall clone. **Input** .. code-block:: python import immunedb.common.config as config from immunedb.common.models import CloneStats session = config.init_db(...) for stat in session.query(CloneStats).filter( CloneStats.clone_id == 1234).order_by(CloneStats.sample_id): print('clone {} has {} unique sequences and {} copies {}'.format( stat.clone_id, stat.unique_cnt, stat.total_cnt, ('in sample ' + stat.sample.name) if stat.sample else 'overall')) **Output** .. code-block:: bash clone 1234 has 53 unique sequences and 1331 copies overall clone 1234 has 27 unique sequences and 379 copies in sample sample1 clone 1234 has 27 unique sequences and 339 copies in sample sample3 clone 1234 has 24 unique sequences and 311 copies in sample sample4 clone 1234 has 28 unique sequences and 302 copies in sample sample10 V-gene Usage ^^^^^^^^^^^^ This is a more complex query which gathers the V-gene usage of all sequences which are (a) in subject with ID 1, (b) associated with a clone, and (c) are unique to the subject, printing them from least to most frequent. **Input** .. code-block:: python import immunedb.common.config as config from immunedb.common.models import Sequence, SequenceCollapse session = config.init_db(...) subject_unique_seqs = session.query( func.count(Sequence.seq_id).label('count'), Sequence.v_gene ).join( SequenceCollapse ).filter( Sequence.subject_id == 1, ~Sequence.clone_id.is_(None), SequenceCollapse.copy_number_in_subject > 0 ).group_by( Sequence.v_gene ).order_by( 'count' ) for seq in subject_unique_seqs: print(seq.v_gene, seq.count) **Output** .. code-block:: bash # ... output truncated ... IGHV4-34 1128 IGHV1-2 1160 IGHV3-48 1169 IGHV4-39 1310 IGHV3-7 1345 IGHV3-30|3-30-5|3-33 1607 IGHV3-23|3-23D 1626 IGHV3-21 1878 .. _models: Data Models ----------- .. automodule:: immunedb.common.models :members: :exclude-members: relationship