Simple OCR App for Image to Text

Often get data or queries as photos? Don't feel like retyping them, especially when they're long or in a table? 😵‍💫 Use this tool! Paste an image → get text right away. You can copy or save the result so it's easier to work with.

Description (translated from Bahasa Indonesia)

Pic to text is a simple PyQt6-based application that lets users paste an image directly into the UI and then extract its text with Tesseract OCR. The OCR result is shown as text and can be saved as a .TXT file for later use.

Features

• Paste images directly from the clipboard
• Display OCR results in a text view
• Save results as a .TXT file
• Reset button to start over without closing the app
• Image preprocessing to improve OCR accuracy

Installation

1. Install Tesseract OCR

For macOS (Homebrew):

brew install tesseract

For Windows:

Download and install it from Tesseract OCR
Set pytesseract.pytesseract.tesseract_cmd to the install path in the script if needed

2. Install Python dependencies

pip install pytesseract opencv-python numpy pandas pyqt6

How to Use

Run the application

python ptt.py

Paste an image from the clipboard by clicking the Paste Image button
View the OCR result in the text view in the UI
Save the result as a .TXT file with the Save to TXT button
Use the Reset button to start over without closing the app

Disclaimer

Because OCR accuracy depends on image quality, the extracted text may not always match what you expect. If the extracted text is inaccurate, try an image with high contrast and clear text.

Description (English)

OCR Table Extractor is a simple PyQt6-based application that lets users paste an image directly into the UI and then extract its text with Tesseract OCR. The extracted text is displayed in the app and can be saved as a .TXT file for later use.

Features
• Paste images directly from clipboard
• Display OCR results in a text view
• Save results as a .TXT file
• Reset button to restart without closing the app
• Image preprocessing for better OCR accuracy

Installation

1. Install Tesseract OCR

For macOS (Homebrew):

brew install tesseract

For Windows:

Download and install it from Tesseract OCR
Set pytesseract.pytesseract.tesseract_cmd to the install path in the script if needed
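
Instead of hard-coding one path, the script could look the executable up at startup. The following is a minimal sketch, not part of the original script: the helper name find_tesseract and the list of candidate locations are assumptions for illustration, and the Windows path is only the usual installer default, which may differ on your machine.

```python
import os
import shutil

# Assumed common install locations; adjust for your machine.
DEFAULT_CANDIDATES = (
    "/opt/homebrew/bin/tesseract",  # Homebrew on Apple Silicon (path used in the script)
    "/usr/local/bin/tesseract",     # Homebrew on Intel Macs
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",  # typical Windows default
)


def find_tesseract(candidates=DEFAULT_CANDIDATES, name="tesseract"):
    """Return a path to the Tesseract executable, or None if none is found."""
    # Prefer whatever is already on PATH.
    on_path = shutil.which(name)
    if on_path:
        return on_path
    # Otherwise fall back to the first candidate that exists on disk.
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None
```

If the function returns a path, it can be assigned to pytesseract.pytesseract.tesseract_cmd before any OCR call. If it returns None, the app could show an error instead of failing on the first paste.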

2. Install Python dependencies

pip install pytesseract opencv-python numpy pandas pyqt6

How to Use

Run the application

python ptt.py

Paste an image from the clipboard by clicking the Paste Image button
View OCR results in the text display area
Save the results as a .TXT file with the Save to TXT button
Use the Reset button to start over without restarting the app

Disclaimer

OCR accuracy depends on image quality, and the extracted text may not always be perfect. If the OCR result is inaccurate, try using an image with high contrast and clear text.
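
One reason contrast matters: before running OCR, the script binarizes the image with OpenCV's Otsu thresholding (cv2.THRESH_OTSU), which picks a single cutoff between dark and light pixels. The pure-Python sketch below illustrates how such a cutoff can be chosen from a grayscale histogram. It is for explanation only; the function names otsu_threshold and binarize are hypothetical, and the script itself relies on OpenCV's built-in implementation.

```python
def otsu_threshold(pixels):
    """Return an Otsu threshold (0-255) for an iterable of 8-bit gray values."""
    # Build the gray-level histogram.
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels at or below the candidate threshold
    sum_bg = 0  # sum of their gray values
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue          # no pixels in the dark class yet
        w_fg = total - w_bg
        if w_fg == 0:
            break             # every pixel is already in the dark class
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance (up to a constant factor); Otsu maximizes it.
        between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def binarize(pixels, threshold):
    """Map values above the threshold to 255 and the rest to 0."""
    return [255 if p > threshold else 0 for p in pixels]
```

When text and background form two well-separated groups of gray values, the cutoff falls cleanly between them. With low contrast the two groups overlap, and some text pixels end up on the wrong side of the cutoff.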

Source code

import sys
import pytesseract
import pandas as pd
from PyQt6.QtWidgets import (QApplication, QWidget, QVBoxLayout, QPushButton, 
                             QLabel, QFileDialog, QTextEdit, QGraphicsView, QGraphicsScene, 
                             QGraphicsPixmapItem)
from PyQt6.QtGui import QPixmap, QImage
from PyQt6.QtCore import Qt
import cv2
import numpy as np

# Path to the Tesseract binary on macOS (Homebrew on Apple Silicon); adjust for your system
pytesseract.pytesseract.tesseract_cmd = "/opt/homebrew/bin/tesseract"

class OCRApp(QWidget):
    def __init__(self):
        super().__init__()
        self.init_ui()

    def init_ui(self):
        # Main layout
        layout = QVBoxLayout()

        # Paste Image button
        self.paste_button = QPushButton("Paste Image")
        self.paste_button.clicked.connect(self.paste_image)
        layout.addWidget(self.paste_button)

        # Image display area (enlarged)
        self.image_view = QGraphicsView()
        self.image_scene = QGraphicsScene()
        self.image_view.setScene(self.image_scene)
        self.image_view.setFixedSize(600, 400)  # Enlarged image display size
        layout.addWidget(self.image_view)

        # OCR result shown as text
        self.result_text = QTextEdit()
        self.result_text.setReadOnly(True)
        self.result_text.setFixedSize(600, 400)
        layout.addWidget(self.result_text)

        # Save to TXT button (originally saved to CSV)
        self.save_button = QPushButton("Save to TXT")
        self.save_button.clicked.connect(self.save_to_txt)
        layout.addWidget(self.save_button)

        # Reset button
        self.reset_button = QPushButton("Reset")
        self.reset_button.clicked.connect(self.reset_app)
        layout.addWidget(self.reset_button)

        self.setLayout(layout)
        self.setWindowTitle("OCR Table Extractor")
        self.setGeometry(100, 100, 600, 500)

        self.image = None  # Holds the currently pasted image (QPixmap)

    def paste_image(self):
        clipboard = QApplication.clipboard()
        pixmap = clipboard.pixmap()

        if not pixmap.isNull():
            self.display_image(pixmap)
            self.process_ocr()

    def display_image(self, pixmap):
        """Display the image in the UI."""
        self.image_scene.clear()
        self.image = pixmap
        item = QGraphicsPixmapItem(pixmap)
        self.image_scene.addItem(item)
        self.image_view.fitInView(item, Qt.AspectRatioMode.KeepAspectRatio)

    def process_ocr(self):
        """Run OCR on the current image and display the result."""
        if self.image is not None:
            # Convert the QPixmap to a QImage with a known RGBA byte order,
            # so the channel layout matches cv2.COLOR_RGBA2GRAY below
            image_qt = self.image.toImage().convertToFormat(QImage.Format.Format_RGBA8888)
            width, height = image_qt.width(), image_qt.height()
            ptr = image_qt.bits()
            ptr.setsize(height * width * 4)  # 4 bytes per pixel (RGBA)
            arr = np.array(ptr).reshape((height, width, 4))
            gray_image = cv2.cvtColor(arr, cv2.COLOR_RGBA2GRAY)

            # Preprocessing: blur to reduce noise, then Otsu thresholding to binarize
            gray_image = cv2.GaussianBlur(gray_image, (5, 5), 0)
            _, thresh_image = cv2.threshold(gray_image, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

            # Run OCR (--psm 6: treat the image as a single uniform block of text)
            ocr_result = pytesseract.image_to_string(thresh_image, config="--psm 6")

            # Show the result in the text area
            self.result_text.setPlainText(ocr_result)

    def save_to_txt(self):
        """Save the OCR result to a TXT file."""
        text = self.result_text.toPlainText()
        if text.strip():
            # Choose where to save the file
            file_path, _ = QFileDialog.getSaveFileName(self, "Save TXT", "", "Text Files (*.txt)")
            if file_path:
                with open(file_path, "w", encoding="utf-8") as file:
                    file.write(text)
                # Note: this replaces the OCR text in the view with a confirmation message
                self.result_text.setPlainText(f"OCR result saved to {file_path}")



    def reset_app(self):
        """Reset the app without restarting it."""
        self.image_scene.clear()
        self.result_text.clear()
        self.image = None

if __name__ == "__main__":
    app = QApplication(sys.argv)
    window = OCRApp()
    window.show()
    sys.exit(app.exec())
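
A note on the buffer conversion in process_ocr: sizing the buffer as height * width * 4 assumes each row holds exactly width * 4 bytes. That is the case for 32-bit formats, but QImage aligns every scanline to a 32-bit boundary, so with other formats (8-bit grayscale, for example) bytesPerLine() can be larger than the pixel data in a row. The sketch below shows how padded rows could be stripped before reshaping. It is pure Python for illustration, and the helper name strip_row_padding is hypothetical, not part of the script or of PyQt6.

```python
def strip_row_padding(buf, width, height, bytes_per_line, bytes_per_pixel=4):
    """Return tightly packed pixel bytes from a buffer whose rows may be padded."""
    row_bytes = width * bytes_per_pixel
    if bytes_per_line < row_bytes:
        raise ValueError("bytes_per_line is smaller than one row of pixels")
    if len(buf) < bytes_per_line * height:
        raise ValueError("buffer is shorter than height * bytes_per_line")
    view = memoryview(buf)
    # Keep only the first row_bytes of each scanline and drop the padding.
    return b"".join(
        view[y * bytes_per_line : y * bytes_per_line + row_bytes]
        for y in range(height)
    )
```

With a QImage, the arguments would come from width(), height(), and bytesPerLine(). The returned bytes have length height * width * bytes_per_pixel, so they can be reshaped to (height, width, channels) without the rows being skewed.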