Next-Gen Tools for Next-Genomics
Modular bioinformatics toolkit for sequence analysis, parsing, ML, and visualization
Everything you need for bioinformatics research
Tools for DNA/RNA sequence manipulation, including reverse complement, motif search, GC content calculation, and translation.
Parsers for standard bioinformatics file formats such as FASTA and FASTQ, with efficient memory usage and error handling.
Pre-built ML pipelines for biological data analysis, covering feature extraction and model training.
Publication-quality plots and charts specifically designed for genomic and biological data presentation.
Comprehensive statistical tools for hypothesis testing, correlation analysis, and data exploration.
User-friendly, modular design that allows easy extension and customization for specific research needs.
Get started in seconds
$ pip install genomehouse
See GenomeHouse in action
from genomehouse import sequence_tools
# Calculate GC content
seq = "ATGCGTACGGCTA"
gc_content = sequence_tools.gc_content(seq)
print(f"GC Content: {gc_content}%")
# Get reverse complement
rev_comp = sequence_tools.reverse_complement(seq)
print(f"Reverse Complement: {rev_comp}")
# Find motifs
motifs = sequence_tools.find_motifs(seq, "GC")
print(f"GC motifs found at: {motifs}")
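For readers who want to see what these three operations compute, here is a minimal pure-Python sketch. It is not GenomeHouse's implementation: the function names `gc_percent` and `find_motif_positions`, the rounding to one decimal place, and the 0-based overlapping motif positions are all assumptions made for illustration, and the library's own return values may differ.

```python
# Illustrative stand-ins only; not taken from the GenomeHouse source.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases, rounded to one decimal place (assumed format)."""
    if not seq:
        return 0.0
    upper = seq.upper()
    return round(100.0 * (upper.count("G") + upper.count("C")) / len(upper), 1)


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motif_positions(seq: str, motif: str) -> list[int]:
    """0-based start index of every occurrence, overlapping ones included."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


seq = "ATGCGTACGGCTA"
print(gc_percent(seq))                  # 53.8
print(reverse_complement(seq))          # TAGCCGTACGCAT
print(find_motif_positions(seq, "GC"))  # [2, 9]
```

Advancing the search by one character rather than by the motif length is what lets overlapping matches be reported.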
from genomehouse import genomic_parsers, sequence_tools
# Parse FASTA file
for header, sequence in genomic_parsers.parse_fasta("data.fasta"):
    print(f">{header}")
    print(f"Length: {len(sequence)}")
    print(f"GC%: {sequence_tools.gc_content(sequence)}")
# Parse FASTQ with quality scores
for record in genomic_parsers.parse_fastq("reads.fastq"):
    print(f"ID: {record.id}")
    print(f"Quality: {record.quality_score}")
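One common way to keep memory usage low when parsing FASTA is to stream the file and yield one record at a time. The sketch below shows that pattern with the standard library only. The name `iter_fasta` and its error handling are hypothetical, and nothing here is taken from how `genomic_parsers.parse_fasta` is actually written.

```python
import io
from typing import Iterator, TextIO, Tuple


def iter_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs, holding only one record in memory."""
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)  # sequences may wrap across several lines
    if header is not None:
        yield header, "".join(chunks)


# In-memory demo input, so the example runs without a file on disk.
demo = io.StringIO(">seq1 example\nATGC\nGTAC\n>seq2\nGGCC\n")
records = list(iter_fasta(demo))
for header, sequence in records:
    print(header, len(sequence))
```

Because the function is a generator, a loop over a multi-gigabyte file only ever holds the current record.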
from genomehouse import ml_tools
# Feature extraction from sequences
sequences = ["ATGCGT", "GCATGC", "TGCATG"]
features = ml_tools.extract_features(sequences)
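# Train classification model
labels = [0, 1, 0]  # placeholder class labels, one per training sequence
model = ml_tools.SequenceClassifier()
model.fit(features, labels)
# Predict new sequences
new_sequences = ["ATGCAT", "GCGTGC"]  # placeholder sequences to classify
predictions = model.predict(new_sequences)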
print(f"Predictions: {predictions}")
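The snippet above does not say which features `extract_features` produces. As one possibility, k-mer counting is a widely used way to turn sequences into fixed-length numeric vectors. The sketch below is an assumption for illustration: the function `kmer_counts` and its parameters are hypothetical and may not match what GenomeHouse computes.

```python
from collections import Counter
from itertools import product


def kmer_counts(seq: str, k: int = 2, alphabet: str = "ACGT") -> list[int]:
    """Count each possible k-mer, in a fixed lexicographic order."""
    vocabulary = ["".join(p) for p in product(alphabet, repeat=k)]
    # Slide a window of width k across the sequence.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts[kmer] for kmer in vocabulary]


sequences = ["ATGCGT", "GCATGC", "TGCATG"]
features = [kmer_counts(s) for s in sequences]
for s, vector in zip(sequences, features):
    print(s, vector)
```

Every vector has `len(alphabet) ** k` entries (16 for dinucleotides), so sequences of different lengths map to rows of the same width.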
# Parse FASTA file
$ genomehouse-cli parse-fasta data/sample.fasta
# Calculate GC content
$ genomehouse-cli gc-content ATGCGTAC
GC Content: 50.0%
# Convert FASTQ to FASTA
$ genomehouse-cli convert reads.fastq output.fasta
# Generate sequence statistics
$ genomehouse-cli stats genome.fasta
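To illustrate what a FASTQ to FASTA conversion involves, here is a small standard-library sketch. It assumes the common four-line FASTQ layout (no wrapped sequences), and the function name `fastq_to_fasta` is hypothetical. It is not the code behind `genomehouse-cli convert`.

```python
import io
from typing import TextIO


def fastq_to_fasta(fastq: TextIO, fasta: TextIO) -> int:
    """Copy IDs and sequences from 4-line FASTQ records; return the record count."""
    written = 0
    while True:
        header = fastq.readline()
        if not header:
            break  # end of input
        if not header.strip():
            continue  # tolerate blank lines between records
        sequence = fastq.readline().strip()
        separator = fastq.readline()
        quality = fastq.readline().strip()
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near: {header.strip()!r}")
        if len(quality) != len(sequence):
            raise ValueError(f"quality/sequence length mismatch in {header.strip()!r}")
        # FASTA keeps the ID and the bases; quality scores are dropped.
        fasta.write(f">{header[1:].strip()}\n{sequence}\n")
        written += 1
    return written


# In-memory demo, so the example runs without files on disk.
reads = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\n!!!!\n")
out = io.StringIO()
count = fastq_to_fasta(reads, out)
print(count)
print(out.getvalue(), end="")
```

The conversion is lossy by design, since FASTA has no field for per-base quality.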