Predicting student academic performance using a machine learning approach. This project leverages the Random Forest Classifier to analyze student data and predict academic outcomes based on various factors such as grades, lifestyle, and social background.
- Project Overview
- Dataset Description
- Technologies Used
- Machine Learning Pipeline
- How to Run
- Model Evaluation
- Results
- Future Work
- License
- Contact
Academic performance prediction can help educators identify students who may need additional support. In this project, we build a Random Forest classification model that predicts whether a student is likely to perform well or poorly, based on demographic and academic features.
- Source: Student Performance Dataset
- Attributes: Gender, Age, Study Time, Failures, Family Support, Internet Access, Absences, G1, G2, G3, etc.
- Target Variable: Final Grade or Performance Label (Pass / Fail)
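
If the Pass / Fail label is used as the target instead of the raw final grade, it has to be derived from `G3`. The sketch below shows one way to do that on a tiny made-up DataFrame. The column names follow the attribute list above; the passing threshold of 10, the `performance` column name, and the sample rows are assumptions for illustration, not values taken from this project.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset has more columns and records.
df = pd.DataFrame(
    {
        "studytime": [2, 1, 3, 2],
        "failures": [0, 2, 0, 1],
        "absences": [4, 10, 0, 6],
        "G1": [12, 7, 15, 9],
        "G2": [13, 6, 16, 10],
        "G3": [13, 5, 17, 9],
    }
)

PASS_THRESHOLD = 10  # assumed cut-off; adjust to the grading scale in use

# 1 = Pass, 0 = Fail
df["performance"] = (df["G3"] >= PASS_THRESHOLD).astype(int)

# G3 defines the label, so it must not also be used as an input feature.
X = df.drop(columns=["G3", "performance"])
y = df["performance"]

print(y.tolist())
```

Dropping `G3` from the features avoids leaking the answer into the model, since the label is computed directly from it.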
- Language: Python
- Environment: Jupyter Notebook
- Libraries:
  - pandas, numpy for data manipulation
  - matplotlib, seaborn for visualization
  - scikit-learn for machine learning and model evaluation
- Data Preprocessing
  - Handling missing values
  - Label encoding categorical variables
  - Feature scaling (if needed)
- Feature Selection
  - Correlation analysis
  - Feature importance from Random Forest
- Model Building
  - Splitting dataset into train/test
  - Applying Random Forest Classifier
  - Hyperparameter tuning (optional)
- Evaluation
  - Confusion Matrix
  - Classification Report
  - Accuracy, Precision, Recall, F1-Score
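
The steps above can be sketched end to end with scikit-learn. This is a minimal illustration, not the notebook's actual code: the data is synthetic, and the column names (`sex`, `internet`, `passed`), the label rule, the 80/20 split, and the hyperparameters are all assumptions. Correlation analysis and hyperparameter tuning are left out to keep it short.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Synthetic stand-in data (assumption): replace with the real student dataset.
rng = np.random.default_rng(42)
n = 200
df = pd.DataFrame(
    {
        "sex": rng.choice(["F", "M"], size=n),
        "internet": rng.choice(["yes", "no"], size=n),
        "studytime": rng.integers(1, 5, size=n),
        "failures": rng.integers(0, 4, size=n),
        "absences": rng.integers(0, 30, size=n).astype(float),
        "G1": rng.integers(0, 21, size=n),
        "G2": rng.integers(0, 21, size=n),
    }
)
# Made-up label rule so the example has something learnable.
df["passed"] = ((df["G1"] + df["G2"]) / 2 >= 10).astype(int)
df.loc[::25, "absences"] = np.nan  # inject a few missing values

# 1. Preprocessing: fill missing values, label-encode categorical columns.
df["absences"] = df["absences"].fillna(df["absences"].median())
for col in ["sex", "internet"]:
    df[col] = LabelEncoder().fit_transform(df[col])

# 2. Train/test split, stratified so both classes appear in each part.
X = df.drop(columns="passed")
y = df["passed"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# 3. Model building. Tree ensembles do not require feature scaling.
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# 4. Evaluation on the held-out test set.
y_pred = model.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
print(cm)
print(classification_report(y_test, y_pred))
```

Fixing `random_state` in both the split and the model makes the run reproducible, which helps when comparing results across experiments.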
Install Python 3.12 and Jupyter Notebook.
git clone https://github.com/RAVINDRAN-S/Students-Performance-Prediction---Random-Forest
cd Students-Performance-Prediction---Random-Forest
pip install pandas numpy matplotlib seaborn scikit-learn
jupyter notebook "Student Performance Prediction - Random Forest.ipynb"

| Metric    | Score |
| --------- | ----- |
| Accuracy  | 0.995 |
| Precision | 1.00  |
| Recall    | 0.96  |
| F1-Score  | 0.98  |
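
For reference, the four scores above follow directly from the counts in a binary confusion matrix. The sketch below shows the arithmetic using made-up counts; they are an assumption for illustration and are not this project's actual confusion matrix.

```python
# Hypothetical confusion-matrix counts, with "Pass" as the positive class.
tp, fp, fn, tn = 45, 5, 10, 40

accuracy = (tp + tn) / (tp + fp + fn + tn)  # share of all predictions that are correct
precision = tp / (tp + fp)                  # share of predicted positives that are correct
recall = tp / (tp + fn)                     # share of actual positives that are found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print(
    f"accuracy={accuracy:.3f} precision={precision:.3f} "
    f"recall={recall:.3f} f1={f1:.3f}"
)
```

Because F1 is a harmonic mean, it always lies between precision and recall and sits closer to the lower of the two.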
- The Random Forest model achieved an accuracy of 0.995, outperforming other basic models.
- Key influential features: G1, G2, studytime, failures, and absences.
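
A ranking like the one above can be read from a fitted model's `feature_importances_` attribute. The following is a minimal sketch on synthetic data: the rows and the label rule are made up for illustration, so the resulting ranking says nothing about the real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Synthetic features (assumption); names mirror the features listed above.
rng = np.random.default_rng(0)
n = 300
X = pd.DataFrame(
    {
        "G1": rng.integers(0, 21, size=n),
        "G2": rng.integers(0, 21, size=n),
        "studytime": rng.integers(1, 5, size=n),
        "failures": rng.integers(0, 4, size=n),
        "absences": rng.integers(0, 30, size=n),
    }
)
y = (X["G2"] >= 10).astype(int)  # made-up label rule that depends only on G2

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances, one per feature, normalized to sum to 1.
importances = pd.Series(
    model.feature_importances_, index=X.columns
).sort_values(ascending=False)
print(importances)
```

Impurity-based importances can favor features with many distinct values, so `sklearn.inspection.permutation_importance` is worth considering as a cross-check.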
This project is licensed under the MIT License.
Developer • ML Enthusiast • Neovim Customizer • Linux Power User
Hi! I'm Ravindran S, an engineering student passionate about:
- Linux & System Engineering
- AIML (Artificial Intelligence & Machine Learning)
- Full-stack Web Development
- Hackathon-grade project development
You can reach me here: