mratsim/Apartment-Interest-Prediction
Predict people's interest in renting specific NYC apartments. The challenge combines structured data, geolocation, time data, free text, and images.
Implements custom feature engineering pipelines, cached via `shelve`, to work around scikit-learn's limitations in applying non-leaky transformations across folds. The models combine XGBoost/LightGBM with HDBSCAN-based geo-clustering, TF-IDF + TruncatedSVD text decomposition, and cyclical time encoding via trigonometric projection. A custom cross-validation framework decouples feature transformations from fold computation, so that expensive NLP operations (sentiment analysis, metro vocabulary extraction) and categorical encodings can be reused across train/validation/test splits without data leakage.
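To make "cyclical time encoding via trigonometric projection" concrete, here is a minimal sketch using only the Python standard library. It is not taken from the repository: the function names (`cyclical_encode`, `encode_timestamp`, `circle_distance`) and the choice of hour-of-day and day-of-week as the encoded fields are assumptions made for illustration. The idea is to map a periodic value onto the unit circle so that values at the ends of the cycle (for example 23:00 and 00:00) end up close together.

```python
import math
from datetime import datetime


def cyclical_encode(value, period):
    """Project a periodic value onto the unit circle as a (sin, cos) pair."""
    angle = 2.0 * math.pi * (value % period) / period
    return math.sin(angle), math.cos(angle)


def encode_timestamp(ts):
    """Hypothetical helper: encode hour-of-day and day-of-week of a datetime."""
    hour_sin, hour_cos = cyclical_encode(ts.hour + ts.minute / 60.0, 24)
    dow_sin, dow_cos = cyclical_encode(ts.weekday(), 7)
    return {
        "hour_sin": hour_sin,
        "hour_cos": hour_cos,
        "dow_sin": dow_sin,
        "dow_cos": dow_cos,
    }


def circle_distance(a, b):
    """Euclidean distance between two (sin, cos) encodings."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


if __name__ == "__main__":
    # Illustrative timestamp only, not data from the project.
    features = encode_timestamp(datetime(2016, 6, 24, 7, 30))
    for name, value in features.items():
        print(f"{name}: {value:+.3f}")
```

Both the sine and the cosine component are needed: either one alone maps two different times of day to the same value. With the pair, a tree model or a distance-based method sees 23:00 as adjacent to 00:00, which a raw 0-23 integer hides.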
No commits in the last 6 months.
Stars: 18
Forks: 1
Language: Jupyter Notebook
License: none listed
Category:
Last pushed: Nov 04, 2017
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/mratsim/Apartment-Interest-Prediction"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
floydhub/regression-template
Build a deep learning model for predicting the price of wine given the description
PavelGrigoryevDS/olist-deep-dive
🌊 Deep Sales Analysis of Olist E-Commerce: EDA | Time Series| Viz | RFM | NLP | Geospatial |...
aws-samples/aws-esg-evaluation-handson
Machine Learning for ESG evaluation
kruts/Fake-Review-Detector
🔍 Detect fake product reviews using NLP techniques, TF-IDF, and Logistic Regression, with an...
fpozoc/ML-engineer-interview-task
Predicting rental prices with Machine Learning and Natural Language Processing