Pavel Vasilyev

LLM Engineer / AI Infrastructure Specialist
Jerusalem, Israel • marchdown@gmail.com • +972-54-343-1123
github.com/marchdown • linkedin.com/in/pavel-vasilyev-65105b149

Profile

LLM engineer with 14 years of ML/NLP experience, including 3 years specializing in production language model systems. Expert in LLM workflows, safety guardrails, prompt engineering, and scaling generative AI infrastructure. Strong functional programming background (Clojure, Elixir, Rust) with production experience in Python, Java, and R. MSc Computational Linguistics, BSc Nuclear Physics.

Experience

Co-Founder & Technical Lead

Fintilligence • Remote • Aug 2024–Dec 2024
  • Built full-stack fintech analytics platform for detecting asymmetric information flow in financial markets
  • Implemented PIN/VPIN (Probability of Informed Trading/Volume-Synchronized PIN) measures for market microstructure analysis
  • Designed and deployed data acquisition pipelines processing Level 2 market data (order book, bid-ask spreads, trade flow)
  • Managed PostgreSQL database architecture for high-frequency trade and quote data storage
  • Coordinated small team (3 people) across research, development, and UI design
  • Built investment advisory system highlighting potential information asymmetries and trading opportunities
  • Tech stack: Python, pandas, PostgreSQL, Flask, AWS (EC2, S3), market data APIs

CTO (Contract)

LevEhat NGO • Remote • Mar 2024–Jul 2024
  • Led technical operations for civic tech nonprofit focused on volunteer coordination
  • Managed migration from Google Cloud Platform to AWS infrastructure (cost optimization)
  • Coordinated UI designers and developers (team of 5) for volunteer management platform
  • Oversaw database architecture for volunteer tracking, task assignment, and activity logging
  • Established development priorities and technical roadmap for platform evolution
  • Tech stack: AWS (EC2, S3), Python, PostgreSQL, React (managed development, not hands-on)

AI Architect (Contract)

Stamina AI • Remote • Jan 2023–Dec 2023
  • Architected and deployed an early therapeutic chatbot system for mental health support
  • Built complete LLM pipeline: prompt engineering, context management, response generation, safety guardrails
  • Integrated OpenAI GPT-3.5/4 APIs with custom safety layers and content filtering
  • Designed conversation state management and session handling for therapeutic context
  • Set up production infrastructure: API gateway, load balancing, monitoring, logging
  • Coordinated with consulting psychotherapists to ensure clinical appropriateness of responses
  • Managed small dev team (2-3 developers) implementing mobile and web interfaces
  • Implemented usage analytics and conversation quality monitoring dashboards
  • Tech stack: Python, OpenAI API, Flask, PostgreSQL, Redis (caching), AWS, Docker, Kubernetes

Independent Consultant

Various Clients • Remote • 2022–2023
  • ML and data consulting: model development, pipeline architecture, statistical analysis
  • Projects included: time-series forecasting, text classification, data pipeline optimization

Senior Researcher

Spring Research • Remote • 2020–2021
  • Developed ML models for trading signal generation using time-series analysis and statistical methods
  • Built data pipelines processing Level 2 market data (tick-by-tick, order book, market depth)
  • Researched topology-inspired approaches to market microstructure modeling
  • Implemented backtesting infrastructure for strategy evaluation
  • Collaborated with quantitative research team on experimental high-frequency strategies
  • Tech stack: Python, pandas, numpy, scikit-learn, AWS, PostgreSQL

Data Scientist

Nestlogic • Remote • 2019
  • Built computer vision models for advertising creative optimization (image feature extraction)
  • Implemented A/B testing infrastructure using statistical hypothesis testing (t-tests, chi-square)
  • Deployed ML models to production on Google Cloud Platform with Kubernetes
  • Developed analytics dashboards tracking model performance and business KPIs
  • Tech stack: Python, PyTorch, OpenCV, GCP, Kubernetes, PostgreSQL

Data Scientist

Maverick Medical AI • Remote • 2018
  • Developed NLP system for medical named entity recognition in clinical text using spaCy and BiLSTM
  • Built medical ontology framework for standardizing terminology across different hospital systems
  • Created decision support tools for clinical workflows highlighting critical findings
  • Worked within HIPAA compliance requirements for healthcare data
  • Tech stack: Python, spaCy, PyTorch, Flask, PostgreSQL, R, Mathematica

Full-Stack Engineer

Athena Portfolio Solutions • Remote • 2017
  • Built full-stack financial NLP platform extracting signals from news and SEC filings
  • Developed Java backend services for data processing and entity recognition
  • Implemented entity linking system connecting market events to portfolio positions
  • Built sentiment analysis models for earnings calls and analyst reports
  • Created knowledge graph of financial entities (companies, people, events, relationships)
  • Tech stack: Java, Python, spaCy, NLTK, scikit-learn, Neo4j, R, Maxima

Independent Tutor & Consultant

Private Practice • Remote / International Relocation • 2016–2017
  • Technical tutoring and consulting during international relocation period (Moscow → Prague → US)
  • Mathematics, statistics, programming, and computational linguistics instruction
  • ML/NLP consulting for various clients

Technical Tutor (Intermittent)

Private Practice • Remote • 2017–Present
  • Ongoing mathematics, statistics, programming, and computational linguistics tutoring
  • Students: high school through graduate level, plus professional colleagues
  • Peak activity during the 2020 pandemic

Education

MSc Computational Linguistics

Russian State University for the Humanities (RSUH) • Moscow, Russia • 2016
Statistical NLP, Machine Translation, Information Extraction

BSc Nuclear Physics

Czech Technical University • Prague, Czech Republic • 2011
Mathematical Modeling, Statistical Analysis, Computational Physics

Technical Skills

Primary Skills
• LLM Engineering (3 years production): OpenAI GPT-3.5/4, Anthropic Claude, prompt engineering, RAG pipelines, context management, safety guardrails, content filtering, production deployment, latency optimization, monitoring
• Programming: Python (10+ years production), Clojure (10+ years), Elixir (2+ years), Rust (2+ years), Java, R, SQL (advanced), bash; additional experience with Common Lisp, Haskell, and Go
• ML/AI: PyTorch, TensorFlow, Keras, spaCy, NLTK, Hugging Face Transformers, scikit-learn, XGBoost, LightGBM, CatBoost
• Deep Learning: CNNs, RNNs, LSTMs, GRUs, Transformers, attention mechanisms, transfer learning, fine-tuning
Domain Experience
• LLM Production Systems: 3 years building therapeutic chatbots, safety systems, prompt engineering, RAG pipelines (Stamina AI)
• Quantitative Finance: Level 2 market data, PIN/VPIN algorithms, order flow analysis, backtesting (Fintilligence, Spring Research)
• Healthcare AI: Medical NER, clinical terminology, HIPAA compliance, decision support tools (Maverick Medical AI)
• Financial NLP: SEC filings analysis, knowledge graphs, entity linking, sentiment analysis (Athena Portfolio Solutions)

Languages

English (Fluent), Russian (Native), French (Conversational), German (Conversational), Czech (Conversational), Hebrew (Basic)