Pavel Vasilyev

LLM Engineer / AI Infrastructure Specialist
Jerusalem, Israel • marchdown@gmail.com • +972-54-343-1123 • marchdown • pavel-vasilyev-65105b149

Summary

LLM engineer with 14 years of ML/NLP experience and 3 years specializing in production language model systems.

Expert in LLM workflows, safety guardrails, prompt engineering, and scaling generative AI infrastructure.

Strong functional programming background (Clojure, Elixir, Rust) with production Python, Java, and R experience.

MSc Computational Linguistics, BSc Nuclear Physics.

Experience

Co-Founder & Technical Lead

founder
Fintilligence • Remote Aug 2024 – Dec 2024
Built fintech analytics platform for detecting asymmetric information flow using PIN/VPIN market microstructure measures
  • Implemented PIN/VPIN (Probability of Informed Trading/Volume-Synchronized PIN) measures for market microstructure analysis
  • Designed and deployed data acquisition pipelines processing Level 2 market data (order book, bid-ask spreads, trade flow)
  • Managed PostgreSQL database architecture for high-frequency trade and quote data storage
  • Coordinated small team (3 people) across research, development, and UI design
  • Built investment advisory system highlighting potential information asymmetries and trading opportunities
Python PostgreSQL pandas

CTO

contract
LevEhat NGO • Remote Mar 2024 – Jul 2024
Led technical operations for civic tech nonprofit, managing cloud migration and platform development
  • Managed migration from Google Cloud Platform to AWS to reduce infrastructure costs
  • Coordinated UI designers and developers (team of 5) for volunteer management platform
  • Oversaw database architecture for volunteer tracking, task assignment, and activity logging
  • Established development priorities and technical roadmap for platform evolution
Python PostgreSQL

Independent Consultant

consulting
Various Clients • Remote 2022 – 2023
ML and data consulting: model development, pipeline architecture, statistical analysis
  • Projects: time-series forecasting, text classification, data pipeline optimization
Python

Senior Researcher

full-time
Spring Research • Remote 2020 – 2021
ML models for trading signal generation using Level 2 market data and topology-inspired approaches
  • Developed ML models for trading signal generation using time-series analysis and statistical methods
  • Built data pipelines processing Level 2 market data (tick-by-tick, order book, market depth)
  • Researched topology-inspired approaches to market microstructure modeling
  • Implemented backtesting infrastructure for strategy evaluation
Python pandas numpy scikit-learn

Data Scientist

full-time
Nestlogic • Remote 2019
Computer vision models for advertising optimization, A/B testing infrastructure, production ML deployment
  • Built computer vision models for advertising creative optimization (image feature extraction)
  • Implemented A/B testing infrastructure using statistical hypothesis testing (t-tests, chi-square)
  • Deployed ML models to production on Google Cloud Platform with Kubernetes
  • Developed analytics dashboards tracking model performance and business KPIs
Python PyTorch OpenCV

Data Scientist

full-time
Maverick Medical AI • Remote 2018
Medical NLP system for clinical entity recognition, ontology frameworks, HIPAA-compliant data handling
  • Developed NLP system for medical named entity recognition in clinical text using spaCy and BiLSTM
  • Built medical ontology framework for standardizing terminology across different hospital systems
  • Created decision support tools for clinical workflows highlighting critical findings
  • Worked within HIPAA compliance requirements for healthcare data
Python spaCy PyTorch Flask

Full-Stack Engineer

full-time
Athena Portfolio Solutions • Remote 2017
Full-stack financial NLP platform extracting signals from news and SEC filings using knowledge graphs
  • Developed Java backend services for data processing and entity recognition
  • Implemented entity linking system connecting market events to portfolio positions
  • Built sentiment analysis models for earnings calls and analyst reports
  • Created knowledge graph of financial entities (companies, people, events, relationships)
Java Python Neo4j

Independent Tutor & Consultant

consulting
Private Practice • Remote / International Relocation 2016 – 2017
Technical tutoring and consulting during international relocation period (Moscow → Prague → US)
  • Mathematics, statistics, programming, and computational linguistics instruction
  • ML/NLP consulting for various clients

Technical Tutor

intermittent
Private Practice • Remote 2017 – Present
Ongoing mathematics, statistics, programming, and computational linguistics tutoring
  • Students: high school through graduate level, plus professional colleagues
  • Peak activity during the 2020 pandemic

Technical Skills

Programming
  • Python (10+ years production)
  • Clojure (10+ years)
  • Elixir (2+ years)
  • Rust (2+ years)
  • Java, R, SQL (advanced), Bash
  • Additional languages: Common Lisp, Haskell, Go
ML/AI
  • Frameworks: PyTorch, TensorFlow, Keras, scikit-learn, XGBoost, LightGBM, CatBoost
  • NLP: spaCy, NLTK, Hugging Face Transformers
  • Deep Learning: CNNs, RNNs, LSTMs, GRUs, Transformers, attention mechanisms, transfer learning, fine-tuning
Cloud & Infrastructure
  • AWS: EC2, S3, SageMaker, Lambda
  • GCP: GCE, GCS, GKE
  • Containers: Docker, Kubernetes
Databases
  • PostgreSQL (expert)
  • MongoDB, Redis, Neo4j, SQLite, MySQL
Data Engineering
  • Apache Spark, Apache Airflow, Kafka
  • ETL pipelines, data quality, orchestration
Web & APIs
  • Flask, FastAPI, Django
  • REST APIs, microservices architecture
DevOps
  • Git, Linux, CI/CD (GitHub Actions, Jenkins)
  • Monitoring, logging

Domain Expertise

LLM Production Systems
3 years • Stamina AI
Therapeutic chatbots, safety systems, prompt engineering, RAG pipelines
Quantitative Finance
2+ years • Fintilligence, Spring Research
Level 2 market data, PIN/VPIN algorithms, order flow analysis, backtesting
Healthcare AI
1 year • Maverick Medical AI
Medical NER, clinical terminology, HIPAA compliance, decision support tools
Financial NLP
1 year • Athena Portfolio Solutions
SEC filings analysis, knowledge graphs, entity linking, sentiment analysis

Education

MSc Computational Linguistics

2016
Russian State University for the Humanities (RSUH) • Moscow, Russia
Statistical NLP, Machine Translation, Information Extraction

BSc Nuclear Physics

2011
Czech Technical University • Prague, Czech Republic
Mathematical Modeling, Statistical Analysis, Computational Physics

Languages

English (Fluent), Russian (Native), French (Conversational), German (Conversational), Czech (Conversational), Hebrew (Basic)