Job Description:
• Identify predictive patterns in alternative textual and tabular data sources
• Mine time‑series and cross‑sectional relationships in high‑frequency and daily tabular data
• Build and compare NLP architectures (transformers, embeddings, topic and sentiment models)
• Develop statistical and machine‑learning models (linear factor, tree‑based, gradient boosting, neural nets) that combine text‑derived features with numeric factors
• Construct robust, transaction‑cost‑aware backtests
• Partner with Data Engineering to scale data pipelines and feature stores
• Work with Portfolio Engineering to integrate signals into systematic strategies and monitor live performance
• Present findings to senior leadership; contribute to Stormlight’s research culture through white papers, internal talks, and code reviews
Requirements:
• Education – M.S. or Ph.D. in Computer Science, Statistics, Physics, Electrical Engineering, Applied Math, or a related quantitative field
• Programming – Expert‑level Python (pandas, NumPy, PyTorch or TensorFlow, scikit‑learn); solid SQL; version control (Git)
• NLP & ML – Hands‑on experience training and fine‑tuning large language models and embeddings, and building classical NLP pipelines; strong grasp of supervised learning, regularization, cross‑validation, and hyperparameter optimization
• Data Handling – Comfort manipulating TB‑scale datasets; proficiency with Spark, Dask, or comparable distributed frameworks
• Research Rigor – Track record of designing repeatable experiments, performing thorough statistical validation, and communicating uncertainty
• Communication – Ability to translate complex technical concepts into clear, actionable insights for stakeholders
Benefits:
• Bonus based on performance
• Flexible schedule
• Home office stipend
• Opportunity for advancement
• Paid time off
• Profit sharing
• Signing bonus