2 posts tagged with "WTT-02" | WHERE TRUE Technologies Documentation

WTT-02 Preview Release

May 3, 2023 · 2 min read

Trent Hauck

Developer

We are excited to announce the preview release of our latest tool, WTT-02, designed specifically for Cheminformatics users. WTT-02 is the second major tool in the WHERE TRUE Tools suite and comes packed with a range of powerful features to simplify your work.

What is WTT-02?

WTT-02 is a Cheminformatics tool that provides a range of features to help users streamline their workflows. With WTT-02, you can:

Input and output SDF files with glob and compression support.
Featurize molecules for machine learning workflows using chemical descriptors, Morgan fingerprints, and other related features.
Subset datasets by substructure or fingerprint similarity.
Run ETL on PubChem datasets from within SQL.
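
To make "glob and compression support" concrete, here is a minimal plain-Python sketch of the idea, not WTT-02's implementation: expand a glob pattern, open each match as plain or gzip-compressed text, and split it into records on the `$$$$` delimiter line that ends each SDF record. The function name `iter_sdf_records` and the choice to detect compression from a `.gz` suffix are assumptions made for illustration.

```python
import glob
import gzip
from typing import Iterator


def iter_sdf_records(pattern: str) -> Iterator[str]:
    """Yield each SDF record as text from every file matching the glob pattern."""
    for path in sorted(glob.glob(pattern)):
        # Assumption for this sketch: a .gz suffix means gzip, anything else is plain text.
        opener = gzip.open if path.endswith(".gz") else open
        with opener(path, "rt", encoding="utf-8") as handle:
            lines = []
            for line in handle:
                if line.strip() == "$$$$":  # delimiter line that closes one record
                    yield "".join(lines)
                    lines = []
                else:
                    lines.append(line)
            # Any trailing text without a closing delimiter is ignored.
```

Calling `iter_sdf_records("data/*.sdf*")` would walk both `data/a.sdf` and `data/b.sdf.gz`; the `data/` directory is a hypothetical path.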

For more information see the documentation.

A Minimal Example

Imagine you have a set of SDF files that you would like to filter based on a substructure and fingerprint similarity, featurize using Morgan fingerprints and molecular descriptors, and finally, write to parquet for use in a machine learning workflow. With WTT-02, you can perform all of these tasks in a single query.

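-- Filter, featurize, and export the matching molecules to Parquet in one query.
COPY (
    SELECT _Name AS name, smiles, features.*
    FROM (
        SELECT featurize(smiles) AS features, _Name, smiles
        FROM read_sd_file('*.sdf')  -- reads every file matching the glob
        -- Keep molecules that contain a benzene ring and are similar to benzene.
        WHERE substructure(smiles, 'c1ccccc1')
          AND tanimoto_similarity(smiles, 'c1ccccc1') > 0.7
    )
) TO 's3://my-bucket/my-file.parquet' (FORMAT PARQUET);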

name	smiles	mw	fsp3	n_lipinski_hba	n_lipinski_hbd	n_rings	n_hetero_atoms	n_heavy_atoms	morgan_fp
name1	c1ccccc1	78.04695	0.0	0	0	1	0	6	[false, ...]
name2	c1ccccc1	46.041866	1.0	1	1	0	1	3	[false, ...]
name2	c1ccccc1	18.010565	0.0	1	2	0	1	1	[false, ...]

And with that you have a featurized dataset ready for machine learning or your data warehouse.
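
For readers unfamiliar with the similarity filter in the query, the sketch below shows the standard Tanimoto coefficient on two bit-vector fingerprints: the number of bits set in both, divided by the number of bits set in either. It is a plain-Python illustration under stated assumptions, not WTT-02's code. The 8-bit fingerprints are made-up toy values, and returning 0.0 for two all-zero fingerprints is a convention chosen here; the post does not say how `tanimoto_similarity` handles that case.

```python
from typing import Sequence


def tanimoto(a: Sequence[bool], b: Sequence[bool]) -> float:
    """Tanimoto (Jaccard) similarity of two equal-length bit vectors."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for x, y in zip(a, b) if x and y)    # bits set in both
    either = sum(1 for x, y in zip(a, b) if x or y)   # bits set in at least one
    # Convention for this sketch: two all-zero fingerprints score 0.0.
    return both / either if either else 0.0


# Toy 8-bit fingerprints (hypothetical values, far shorter than real Morgan fingerprints).
query = [True, True, False, True, False, False, True, False]
candidate = [True, True, False, True, False, False, False, False]

score = tanimoto(query, candidate)
print(score, score > 0.7)  # 3 shared bits / 4 set bits -> prints "0.75 True"
```

With a threshold of 0.7, as in the query above, this candidate would be kept.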

If your data already lives in a Postgres database, see our guide to querying Postgres with Exon-DuckDB. The same idea applies: you can quickly export data based on substructure or similarity.

Install Improvements

March 13, 2023 · One min read

Trent Hauck

Developer

Over the last week, we've made a couple of improvements to how Exon-DuckDB is distributed. We've added Google login, and we've made it easier to install Exon-DuckDB.

Google Login

It's pretty simple. The login page now has two new buttons, one for login and one for sign-up. Give it a shot if you're so inclined: https://wheretrue.com/login.

Exon-DuckDB on PyPI

Again, pretty simple: $ pip install exondb. For more information on how to use it, see the documentation and the installation instructions.
