Qt Data Cleaner

Interactive desktop tool for cleaning and transforming datasets. Built with Python, PyQt5, and pandas.

Features

File Support: Load and export CSV and Excel files (.csv, .xlsx, .xls)
Column Profiling: View detailed statistics for each column including:
- Data type
- Count of values
- Missing values and percentage
- Unique values
- Statistics for numeric columns (mean, std, min, max)
Missing Value Handling: Multiple strategies for dealing with missing data:
- Drop rows with missing values
- Fill with mean, median, or mode
- Forward fill or backward fill
- Fill with custom constant value
Data Transformations:
- Normalize columns to 0-1 range
- Standardize columns (Z-score normalization)
- Label encode categorical columns
- Drop duplicate rows
- Reset dataframe index
Undo/Redo Support: Every operation is recorded in a history, so changes can be undone and redone
Data Preview: Interactive table view with alternating row colors and missing value highlighting
Export Pipeline: Save cleaned data to CSV or Excel format
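
The statistics listed under Column Profiling can be computed with a few pandas calls. The following is a minimal sketch, not the project's actual code: the function names `profile_column` and `profile_frame` and the example data are hypothetical, and the real profiler may round or label values differently.

```python
import pandas as pd


def profile_column(series: pd.Series) -> dict:
    """Summarise one column with the statistics listed under Column Profiling."""
    total = len(series)
    missing = int(series.isna().sum())
    profile = {
        "dtype": str(series.dtype),
        "count": int(series.count()),      # number of non-missing values
        "missing": missing,
        "missing_pct": round(100 * missing / total, 2) if total else 0.0,
        "unique": int(series.nunique()),   # missing values are not counted
    }
    # Numeric-only statistics; boolean columns are skipped on purpose.
    is_numeric = pd.api.types.is_numeric_dtype(series)
    is_bool = pd.api.types.is_bool_dtype(series)
    if is_numeric and not is_bool:
        profile.update(
            mean=float(series.mean()),
            std=float(series.std()),       # sample std (ddof=1), the pandas default
            min=float(series.min()),
            max=float(series.max()),
        )
    return profile


def profile_frame(df: pd.DataFrame) -> dict:
    """Profile every column of a DataFrame, keyed by column name."""
    return {name: profile_column(df[name]) for name in df.columns}


# Hypothetical data for illustration only (not the bundled sample_data.csv).
df = pd.DataFrame({"age": [30.0, None, 40.0, 50.0],
                   "dept": ["HR", "IT", None, "IT"]})
report = profile_frame(df)
```

Here `report["age"]` holds the numeric statistics, while `report["dept"]` contains only the type, count, missing, and unique entries.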

Installation

Requirements

Python 3.7 or higher
pip package manager

Setup

Clone the repository:

git clone https://github.com/BaseMax/qt-data-cleaner.git
cd qt-data-cleaner

Install dependencies:

pip install -r requirements.txt

Usage

Running the Application

python main.py

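Or, on Linux or macOS, make it executable and run it directly (main.py needs a shebang line such as #!/usr/bin/env python3 for this to work):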

chmod +x main.py
./main.py

Quick Start Guide

Open a Dataset:
- Click "File" → "Open..." or use Ctrl+O
- Select a CSV or Excel file
- The data will be displayed in the table view
View Column Profile:
- The right panel shows detailed statistics for each column
- Missing values are highlighted in red in the table
Handle Missing Values:
- Click "Data" → "Handle Missing Values..." or use the toolbar button
- Select a fill method (mean, median, mode, etc.)
- Choose which columns to apply the operation to
- Click OK to apply
Transform Data:
- Click "Data" → "Transform..." or use the toolbar button
- Select a transformation (normalize, standardize, encode, etc.)
- Choose columns if applicable
- Click OK to apply
Undo/Redo:
- Use "Edit" → "Undo" (Ctrl+Z) to revert changes
- Use "Edit" → "Redo" (Ctrl+Shift+Z) to reapply changes
Export Data:
- Click "File" → "Export..." or use Ctrl+S
- Choose output format (CSV or Excel)
- Save the cleaned dataset
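
The fill methods offered in the "Handle Missing Values" step correspond to standard pandas operations. The sketch below shows one way the dialog's choice could be applied to the selected columns. The function name `handle_missing` and the method keys are assumptions made for illustration; the application's own implementation may be organised differently.

```python
import pandas as pd

# Hypothetical method keys mapped to the pandas call that implements each one.
_FILLERS = {
    "mean": lambda s, _: s.fillna(s.mean()),
    "median": lambda s, _: s.fillna(s.median()),
    "ffill": lambda s, _: s.ffill(),
    "bfill": lambda s, _: s.bfill(),
    "constant": lambda s, value: s.fillna(value),
}


def _fill_mode(series: pd.Series) -> pd.Series:
    """Fill with the most frequent value; leave the column alone if all values are missing."""
    modes = series.mode()
    return series.fillna(modes.iloc[0]) if not modes.empty else series


def handle_missing(df, columns, method, constant=None):
    """Return a copy of df with missing values in `columns` handled by `method`."""
    if method not in _FILLERS and method not in ("drop", "mode"):
        raise ValueError(f"unknown method: {method!r}")
    if method == "constant" and constant is None:
        raise ValueError("method 'constant' needs a fill value")

    out = df.copy()  # never mutate the caller's frame, so undo stays simple
    if method == "drop":
        return out.dropna(subset=list(columns))
    for col in columns:
        if method == "mode":
            out[col] = _fill_mode(out[col])
        else:
            out[col] = _FILLERS[method](out[col], constant)
    return out
```

For example, `handle_missing(df, ["salary"], "median")` would return a new frame in which a hypothetical `salary` column has its gaps filled with the column median, leaving `df` itself untouched.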

Sample Data

A sample dataset (sample_data.csv) is included in the repository for testing. It contains employee data with some missing values.
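
To inspect a file such as sample_data.csv outside the GUI, a loader that dispatches on the file extension is enough. This is a sketch under assumptions: `load_dataset` and `export_dataset` are hypothetical names, not functions from this repository. Note that pandas reads .xlsx through openpyxl, while reading legacy .xls files typically requires the separate xlrd package, which is not among the listed dependencies.

```python
from pathlib import Path

import pandas as pd


def load_dataset(path) -> pd.DataFrame:
    """Read a CSV or Excel file into a DataFrame, chosen by file extension."""
    suffix = Path(path).suffix.lower()
    if suffix == ".csv":
        return pd.read_csv(path)
    if suffix in {".xlsx", ".xls"}:
        # pandas picks the engine from the extension (openpyxl for .xlsx).
        return pd.read_excel(path)
    raise ValueError(f"unsupported file type: {suffix!r}")


def export_dataset(df: pd.DataFrame, path) -> None:
    """Write a DataFrame to CSV or .xlsx without the index column."""
    suffix = Path(path).suffix.lower()
    if suffix == ".csv":
        df.to_csv(path, index=False)
    elif suffix == ".xlsx":
        df.to_excel(path, index=False)
    else:
        raise ValueError(f"unsupported export type: {suffix!r}")
```

Calling `load_dataset("sample_data.csv")` from the repository root would return the sample data as a DataFrame.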

Keyboard Shortcuts

Ctrl+O: Open file
Ctrl+S: Export file
Ctrl+Z: Undo
Ctrl+Shift+Z: Redo
F5: Refresh profile
Ctrl+Q: Quit application

Architecture

The application is structured into several components:

main.py: Application entry point
main_window.py: Main GUI window and user interface
data_model.py: Data management with undo/redo support
transformers.py: Data transformation utilities
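
One common way to give a data model undo/redo support is to keep two stacks of snapshots. The sketch below illustrates that technique in general terms. It is not taken from data_model.py, and the class name `History`, its methods, and the `max_depth` limit are all hypothetical.

```python
import copy


class History:
    """Snapshot-based undo/redo: one stack of past states, one of undone states."""

    def __init__(self, initial, max_depth=50):
        self._current = copy.deepcopy(initial)
        self._undo = []            # states before each applied operation
        self._redo = []            # states removed by undo, newest last
        self._max_depth = max_depth

    @property
    def current(self):
        return self._current

    def apply(self, operation):
        """Run operation(state) -> new state and record the previous state."""
        new_state = operation(copy.deepcopy(self._current))
        self._undo.append(self._current)
        if len(self._undo) > self._max_depth:
            self._undo.pop(0)      # drop the oldest snapshot to bound memory
        self._redo.clear()         # a new edit invalidates the redo branch
        self._current = new_state

    def undo(self) -> bool:
        """Step back one operation; return False if there is nothing to undo."""
        if not self._undo:
            return False
        self._redo.append(self._current)
        self._current = self._undo.pop()
        return True

    def redo(self) -> bool:
        """Reapply the last undone operation; return False if none is pending."""
        if not self._redo:
            return False
        self._undo.append(self._current)
        self._current = self._redo.pop()
        return True
```

With a pandas DataFrame as the state, an operation could be something like `lambda df: df.drop_duplicates()`. Full snapshots are simple but memory-hungry for large datasets, which is why the sketch caps the stack depth.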

Dependencies

PyQt5: GUI framework
pandas: Data manipulation and analysis
numpy: Numerical computing
openpyxl: Excel file support
scikit-learn: Data preprocessing and transformations
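
The column transformations listed under Features reduce to short formulas. Whether the project implements them with scikit-learn (for example `MinMaxScaler`, `StandardScaler`, and `LabelEncoder`) or directly in pandas is not stated here, so the sketch below uses pandas only. The function names are hypothetical.

```python
import pandas as pd


def normalize(series: pd.Series) -> pd.Series:
    """Min-max scale to the 0-1 range: (x - min) / (max - min)."""
    span = series.max() - series.min()
    if span == 0:
        return series * 0.0        # constant column: avoid division by zero
    return (series - series.min()) / span


def standardize(series: pd.Series) -> pd.Series:
    """Z-score: (x - mean) / std, using the population std (ddof=0)."""
    std = series.std(ddof=0)       # ddof=0 matches scikit-learn's StandardScaler
    if std == 0:
        return series - series.mean()
    return (series - series.mean()) / std


def label_encode(series: pd.Series) -> pd.Series:
    """Map each category to an integer code; categories are sorted first."""
    # Missing values receive the code -1.
    return series.astype("category").cat.codes
```

The two remaining transformations need no helper: `df.drop_duplicates()` removes duplicate rows and `df.reset_index(drop=True)` renumbers the index from zero.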

License

MIT License - see LICENSE file for details.

Author

Max Base

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
