mratsim/Apartment-Interest-Prediction

Predict people's interest in renting specific NYC apartments. The challenge combines structured data, geolocation, time data, free text and images.

Score: 12 / 100 (Experimental)

Implements custom feature engineering pipelines, cached via `shelve`, to work around Scikit-Learn's limitations in applying leak-free transformations across folds. The models are XGBoost and LightGBM, fed with HDBSCAN-based geo-clustering, TF-IDF + TruncatedSVD text decomposition, and cyclical time encoding via trigonometric projection. A custom cross-validation framework decouples feature transformations from fold computation, so expensive NLP operations (sentiment analysis, metro vocabulary extraction) and categorical encodings can be reused across train/validation/test splits without data leakage.
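As an illustration of the cyclical time encoding mentioned above, here is a minimal sketch using only the Python standard library. It is not taken from the repository: the function names `encode_cyclical` and `circle_distance` are hypothetical, and the choice of hour-of-day with a period of 24 is an assumption made for the example. The idea is that a periodic value is projected onto the unit circle, so values at the two ends of the cycle (23:00 and 00:00) end up close together instead of far apart.

```python
import math


def encode_cyclical(value, period):
    """Project a periodic value onto the unit circle as a (sin, cos) pair.

    Hypothetical helper for illustration; `period` is the cycle length,
    e.g. 24 for hour of day or 7 for day of week.
    """
    angle = 2.0 * math.pi * (value % period) / period
    return math.sin(angle), math.cos(angle)


def circle_distance(a, b, period):
    """Euclidean distance between two values after cyclical encoding."""
    sin_a, cos_a = encode_cyclical(a, period)
    sin_b, cos_b = encode_cyclical(b, period)
    return math.hypot(sin_a - sin_b, cos_a - cos_b)


# Hour 23 and hour 0 are neighbours on the clock, so their encoded
# distance matches that of hours 0 and 1. A raw integer feature would
# instead place 23 and 0 at opposite ends of the range.
late_vs_midnight = circle_distance(23, 0, 24)
midnight_vs_one = circle_distance(0, 1, 24)

# Noon and midnight are opposite points on the circle.
noon_vs_midnight = circle_distance(12, 0, 24)
```

Both components are needed as features: the sine alone cannot tell 03:00 from 09:00, while the (sin, cos) pair identifies each hour uniquely.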

No commits in the last 6 months.

No License · Stale 6m · No Package · No Dependents
Maintenance 0 / 25
Adoption 6 / 25
Maturity 1 / 25
Community 5 / 25


Stars: 18
Forks: 1
Language: Jupyter Notebook
License: none
Last pushed: Nov 04, 2017
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/mratsim/Apartment-Interest-Prediction"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.