
Christopher Csiszar

Data Science | Math | Finance | Business | Economics


Beware Default Random Forest Importances

  • Unreliable Default Feature Importances: The default feature importance strategies in scikit-learn’s Random Forest implementation and in R’s Random Forest are not reliable when predictor variables differ in scale or in number of categories.

  • Use Permutation Importance: To get reliable feature importances, use permutation importance. In Python, this can be done with the rfpimp package. In R, set importance=T in the Random Forest constructor and type=1 in the importance() function.

  • Preferred Strategies for All Models: For all machine learning models, prefer permutation or drop-column importance strategies over interpreting internal model parameters as proxies for feature importances.
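As a rough illustration of the permutation strategy described above, here is a minimal sketch in plain Python. It is not the rfpimp or scikit-learn implementation: the helper names (`neg_mse`, `permutation_importance`, `toy_model`) and the toy data are assumptions made up for this example. The idea is to score a fitted model once, shuffle one column at a time, and record how much the score drops.

```python
import random

def neg_mse(y_true, y_pred):
    """Negative mean squared error, so that higher is better."""
    return -sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(predict, X, y, metric=neg_mse, n_repeats=5, seed=0):
    """Average drop in score when each column of X is shuffled (hypothetical helper)."""
    rng = random.Random(seed)
    baseline = metric(y, [predict(row) for row in X])
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle a copy of one column; every other column stays untouched.
            column = [row[col] for row in X]
            rng.shuffle(column)
            X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, column)]
            drops.append(baseline - metric(y, [predict(r) for r in X_perm]))
        importances.append(sum(drops) / n_repeats)
    return importances

# Toy data (assumed): feature 0 drives the target, feature 1 is large-scale noise.
data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random() * 1000] for _ in range(50)]
y = [3 * row[0] for row in X]

def toy_model(row):
    """Stand-in for a fitted model; it only looks at feature 0."""
    return 3 * row[0]

importances = permutation_importance(toy_model, X, y)
print(importances)
```

Because the stand-in model never reads feature 1, shuffling that column leaves the score unchanged even though its values are a thousand times larger, which is the kind of scale-independence the bullets above are asking for.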


An Easy to Use Waterfall Chart Function for Python

  • Effective Visualization Tool: Waterfall charts show how individual marginal contributions add to or subtract from an initial value, making changes over time easy to follow.

  • User-Friendly Python Package: This package offers a hassle-free way to generate waterfall charts in Python, with improvements to data range reliability, appearance, and chart options.

  • Versatile Applications: Waterfall charts can display many kinds of data, from revenue and expenses to the marginal contributions within any system, making them widely applicable.
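To make the idea of marginal contributions concrete, the sketch below computes where each bar of a waterfall chart starts and ends. It is only an illustration of the underlying arithmetic, not the package's API; the function name `waterfall_bars` and the sample figures are invented for this example.

```python
def waterfall_bars(labels, deltas):
    """Return (label, start, end) for each bar, plus a final net bar (hypothetical helper)."""
    bars = []
    running = 0
    for label, delta in zip(labels, deltas):
        # Each bar begins where the previous one ended.
        bars.append((label, running, running + delta))
        running += delta
    # The closing bar spans from zero to the accumulated total.
    bars.append(("net", 0, running))
    return bars

# Made-up sample figures: positive values add to the total, negative ones subtract.
bars = waterfall_bars(
    ["revenue", "cost of goods", "other income", "tax"],
    [100, -30, 20, -10],
)
for label, start, end in bars:
    print(f"{label:>14}: {start:>4} -> {end:>4}")
```

A plotting library would then draw each tuple as a floating bar between its start and end values.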


Internal Charles River Associates Presentation

  • Setting Up for Web Scraping: Use an Integrated Development Environment (IDE) to ensure your web scraping projects are organized, repeatable, and efficient. Your IDE should have a source code editor, a project files directory, and a Python console.

  • Basic Commands for Web Scraping: Learn essential Python commands for web scraping, such as pausing scripts, storing text from web elements, printing output, and creating or writing to files.

  • Introduction to Web Scraping: Web scraping involves extracting unstructured data from websites and transforming it into structured data for analysis. Python, with its rich library ecosystem, is highly recommended for web scraping projects.
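The basic commands listed above (pausing a script, storing text from web elements, printing output, and writing files) can be sketched with the standard library alone. This is a hedged illustration rather than the presentation's own code: the HTML snippets in `PAGES`, the `TableTextParser` class, and the `scrape` function are assumptions, and an in-memory buffer stands in for a real output file so the example has no side effects.

```python
import csv
import io
import time
from html.parser import HTMLParser

class TableTextParser(HTMLParser):
    """Collect the text of each <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self._in_cell = False
            if self._row:
                self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_cell:
            # Store the text found inside the current cell.
            self._row[-1] += data.strip()

# Stand-ins for downloaded pages (assumed sample markup, no network access).
PAGES = [
    "<table><tr><td>Widget</td><td>4.50</td></tr></table>",
    "<table><tr><td>Gadget</td><td>12.00</td></tr></table>",
]

def scrape(pages, delay=0.0):
    """Parse each page in turn, pausing between pages as a real scraper would."""
    rows = []
    for html in pages:
        parser = TableTextParser()
        parser.feed(html)
        parser.close()
        rows.extend(parser.rows)
        time.sleep(delay)  # pause the script between pages
    return rows

rows = scrape(PAGES)

# Write the structured rows as CSV; swap the buffer for open("out.csv", "w", newline="") to save a file.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["name", "price"])
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

In a real project the strings in `PAGES` would come from HTTP responses, and the delay would be set high enough to avoid overloading the target site.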
