astrodb module¶
Main database handler code
- class astrodbkit.astrodb.AstrodbQuery(entities: _ColumnsClauseArgument[Any] | Sequence[_ColumnsClauseArgument[Any]], session: Session | None = None)[source]¶
Bases:
QuerySubclassing the Query class to add more functionality. See: https://stackoverflow.com/questions/15936111/sqlalchemy-can-you-add-custom-methods-to-the-query-object
Construct a
_query.Querydirectly.E.g.:
q = Query([User, Address], session=some_session)
The above is equivalent to:
q = some_session.query(User, Address)
- Parameters:
entities – a sequence of entities and/or SQL expressions.
session – a
Sessionwith which the_query.Querywill be associated. Optional; a_query.Querycan be associated with aSessiongeneratively via the_query.Query.with_session()method as well.
See also
Session.query()_query.Query.with_session()- astropy(spectra=None, spectra_format=None, **kwargs)[source]¶
Allow SQLAlchemy query output to be formatted as an astropy Table
- Parameters:
- spectrastr or list
List of columns to process as spectra
- spectra_formatstr
Format to apply for all spectra. Default: None means specutils will attempt to find the best one.
- Returns:
- tastropy.Table
Table output of query
- pandas(spectra=None, spectra_format=None, **kwargs)[source]¶
Allow SQLAlchemy query output to be formatted as a pandas DataFrame
- Parameters:
- spectrastr or list
List of columns to process as spectra
- spectra_formatstr
Format to apply for all spectra. Default: None means specutils will attempt to find the best one.
- Returns:
- dfpandas.DataFrame
DataFrame output of query
- class astrodbkit.astrodb.Database(connection_string, reference_tables=['Publications', 'Telescopes', 'Instruments', 'Modes', 'Filters', 'PhotometryFilters', 'Citations', 'References', 'Versions', 'Parameters', 'Regimes'], primary_table='Sources', primary_table_key='source', foreign_key='source', column_type_overrides={}, sqlite_foreign=True, connection_arguments={}, schema=None)[source]¶
Bases:
objectDatabase handler class
Wrapper for database calls and utility functions
- Parameters:
- connection_stringstr
Connection string to establish a database connection
- reference_tableslist
List of reference tables; these are treated separately from data tables. Default: [‘Publications’, ‘Telescopes’, ‘Instruments’]
- primary_tablestr
Name of the primary source table. Default: Sources
- primary_table_keystr
Name of the primary key in the sources table. This is meant to be unique and used to join tables. Default: source
- foreign_keystr
Name of the foreign key in other tables that refer back to the primary table. Default: source
- column_type_overridesdict
Dictionary with table.column type overrides. For example, {‘spectra.spectrum’: sqlalchemy.types.TEXT()} will set the table spectra, column spectrum to be of type TEXT()
- sqlite_foreignbool
Flag to enable/disable use of foreign keys with SQLite. Default: True
- connection_argumentsdict
Additional connection arguments, like {‘check_same_thread’: False}. Default: {}
- schemastr
Helper for setting default PostgreSQL schema. Equivalent to connection_arguments={“options”: f”-csearch_path={schema}”}
- add_table_data(data, table, fmt='csv')[source]¶
Method to insert data into the database. Column names in the file must match those of the database table. Additional columns in the supplied table are ignored. Format options include:
csv
astropy
pandas
- Parameters:
- datastr or astropy.Table or pandas.DataFrame
Name of file or Table or DataFrame to load
- tablestr
Name of table to insert records into
- fmtstr
Data format. Default: csv
- inventory(name, pretty_print=False)[source]¶
Method to return a dictionary of all information for a given source, matched by name. Each table is a key of this dictionary.
- Parameters:
- namestr
Name of the source to search for
- pretty_printbool
Optionally print out the dictionary contents on screen. Default: False
- Returns:
- data_dictdict
Dictionary of all information for the given source.
- load_database(directory: str, verbose: bool = False, reference_directory: str = 'reference', source_directory: str = 'source')[source]¶
Reload entire database from a directory of JSON files. Note that this will first clear existing tables.
- Parameters:
- directorystr
Name of top-level directory containing the JSON files
- verbosebool
Flag to enable diagnostic messages
- reference_directorystr
Relative path to sub-directory to use for reference JSON files (eg, data/reference)
- source_directorystr
Relative path to sub-directory to use for source JSON files (eg, data/source)
- load_json(filename)[source]¶
Load single source JSON into the database
- Parameters:
- filenamestr
Name of directory containing the JSON file
- load_table(table, directory, verbose=False)[source]¶
Load a reference table to the database, expects there to be a file of the form [table].json
- Parameters:
- tablestr
Name of table to load. Table must already exist in the schema.
- directorystr
Name of directory containing the JSON file
- verbosebool
Flag to enable diagnostic messages
- query_region(target_coords, radius=<Quantity 10. arcsec>, output_table=None, fmt='table', coordinate_table=None, ra_col='ra', dec_col='dec', frame='icrs', unit='deg')[source]¶
Perform a cone search of the given coordinates and return the specified output table.
- Parameters:
- target_coordsSkyCoord
Astropy SkyCoord object of coordinates to search around
- radiusQuantity or float
Radius as an astropy Quantity object in which to search for objects. If not a Quantity will convert to one assuming units are arcseconds. Default: 10 arcseconds
- output_tablestr
Name of table to match. Default: primary table (eg, Sources)
- fmtstr
Format to return results in (pandas, astropy/table, default). Default is astropy table
- coordinate_tablestr
Table to use for coordinates. Default: primary table (eg, Sources)
- ra_colstr
Name of column to use for RA values. Default: ra
- dec_colstr
Name of column to use for Dec values. Default: dec
- framestr
Coordinate frame for objects in the database. Default: icrs
- unitstr or tuple of Unit or str
Unit of ra/dec (or equivalent) in database. Default: deg
- Returns:
- List of SQLAlchemy results
- save_database(directory: str, clear_first: bool = True, reference_directory: str = 'reference', source_directory: str = 'source')[source]¶
Output contents of the database into the specified directory as JSON files. Source objects have individual JSON files with all data for that object. Reference tables have a single JSON for all contents in the table.
- Parameters:
- directorystr
Name of top-level directory in which to save the output JSON
- clear_firstbool
First clear the directory of all existing JSON (useful to capture DB deletions). Default: True
- reference_directorystr
Name of sub-directory to use for reference JSON files (eg, data/reference)
- source_directorystr
Name of sub-directory to use for source JSON files (eg, data/source)
- save_json(name, directory)[source]¶
Output database contents as JSON data for matched source into specified directory
- Parameters:
- namestr
Name of source to match by primary key. Alternatively can also be a row from a query against the source table.
- directorystr
Name of directory in which to save the output JSON
- save_reference_table(table: str, directory: str, reference_directory: str = 'reference')[source]¶
Save the reference table to disk
- Parameters:
- tablestr
Name of reference table to output
- directorystr
Name of directory in which to save the output JSON
- reference_directorystr
Name of sub-directory to use for reference JSON files (eg, data/reference)
- search_object(name, output_table=None, resolve_simbad=False, table_names={'Names': ['other_name'], 'Sources': ['source', 'shortname']}, fmt='table', fuzzy_search=True, verbose=True)[source]¶
Query the database for the object specified. By default will return the primary table, but this can be specified. Users can also request to resolve the object name via Simbad and query against all Simbad names.
- Parameters:
- namestr or list
Object name(s) to match
- output_tablestr
Name of table to match. Default: primary table (eg, Sources)
- resolve_simbadbool
Get additional names from Simbad. Default: False
- table_namesdict
Dictionary of tables to search for name information. Should be of the form table name: column name list. Default: {‘Sources’: [‘source’, ‘shortname’], ‘Names’: ‘other_name’}
- fmtstr
Format to return results in (pandas, astropy/table, default). Default is astropy table
- fuzzy_searchbool
Flag to perform partial searches on provided names (default: True)
- verbosebool
Output some extra messages (default: True)
- Returns:
- List of SQLAlchemy results
- search_string(value, fmt='table', fuzzy_search=True, verbose=True)[source]¶
Search an abitrary string across all string columns in the full database
- Parameters:
- valuestr
String to search for
- fmtstr
Format to return results in (pandas, astropy/table, default). Default is astropy table
- fuzzy_searchbool
Flag to perform partial searches on provided names (default: True)
- verbosebool
Output results to screen in addition to dictionary (default: True)
- Returns:
- Dictionary of results, with each key being the matched table names
- astrodbkit.astrodb.copy_database_schema(source_connection_string, destination_connection_string, sqlite_foreign=False, ignore_tables=[], copy_data=False)[source]¶
Copy a database schema (ie, all tables and columns) from one database to another Adapted from https://gist.github.com/pawl/9935333
- Parameters:
- source_connection_stringstr
Connection string to source database
- destination_connection_stringstr
Connection string to destination database
- sqlite_foreignbool
Flag to enable foreign key checks for SQLite; passed to
load_connection. Default: False- ignore_tableslist
List of tables to not copy
- copy_databool
Flag to enable copying data to the new database. Default: False
- astrodbkit.astrodb.create_database(connection_string, drop_tables=False, felis_schema=None)[source]¶
Create a database from a schema that utilizes the
astrodbkit2.astrodb.Baseclass. Some databases, eg Postgres, must already exist but any tables should be dropped. The default behavior is to assume that a schema with SQLAlchemy definitions has been imported prior to calling this function. If instead, Felis is being used to define the schema, the path to the YAML file needs to be provided to the felis_schema parameter (as a string).- Parameters:
- connection_stringstr
Connection string to database
- drop_tablesbool
Flag to drop existing tables. This is needed when the schema changes. (Default: False)
- felis_schemastr
Path to schema yaml file
- astrodbkit.astrodb.load_connection(connection_string, sqlite_foreign=True, base=None, connection_arguments={})[source]¶
Return session, base, and engine objects for connecting to the database.
- Parameters:
- connection_stringstr
The connection string to connect to the database. The connection string should take the form:
dialect+driver://username:password@host:port/database- sqlite_foreignbool
Flag to enable foreign key checks for SQLite. Default: True
- baseSQLAlchemy base object
Use an existing base class. Default: None (ie, creates a new one)
- connection_argumentsdict
Additional connection arguments, like {“check_same_thread”: False}. When using PostgreSQL, you may need to set {“options”: “-csearch_path=SCHEMA_NAME”} to set the schema you’ll be using
- Returns:
- sessionsession object
Provides a holding zone for all objects loaded or associated with the database.
- basebase object
Provides a base class for declarative class definitions.
- engineengine object
Provides a source of database connectivity and behavior.