Real-time Credit Card Fraud Detection using Spark 2.2
Brief Introduction
Real-time credit card fraud detection using Spark Streaming, Spark ML, Kafka, Cassandra and Airflow.
Description
Real-time credit card fraud detection is implemented using Spark, Kafka and Cassandra.
Spark ML pipeline stages such as StringIndexer, OneHotEncoder and VectorAssembler are used for pre-processing.
The machine learning model is built with the Random Forest algorithm.
Data balancing is done using the K-means algorithm.
The Spark Streaming job is integrated with Kafka and Cassandra.
Exactly-once semantics are achieved through custom offset management in Spark Streaming.
The Airflow automation framework is used to automate Spark jobs on a Spark Standalone cluster.
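The idea behind custom offset management can be sketched without a cluster. The example below is a minimal illustration in plain Python, not the project's Spark code: a list stands in for one Kafka partition, an in-memory SQLite database stands in for Cassandra, and the names open_store, load_offset, process_batch and the amount-threshold classifier are all hypothetical. It shows one common way to get exactly-once results: the job reads its own stored offset on start-up, and saves the batch results and the next offset in a single transaction, so a crash leaves neither behind and the retry replays the same records.

```python
import sqlite3


def open_store(path=":memory:"):
    """Create result and offset tables (a stand-in for the real data store)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS predictions ("
        "txn_id TEXT PRIMARY KEY, is_fraud INTEGER)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS offsets ("
        "topic TEXT, part INTEGER, next_offset INTEGER, "
        "PRIMARY KEY (topic, part))"
    )
    conn.commit()
    return conn


def load_offset(conn, topic, part):
    """Return the next offset to read, or 0 if nothing was committed yet."""
    row = conn.execute(
        "SELECT next_offset FROM offsets WHERE topic = ? AND part = ?",
        (topic, part),
    ).fetchone()
    return row[0] if row else 0


def process_batch(conn, log, topic, part, batch_size, classify,
                  fail_before_commit=False):
    """Score one batch and store results plus the new offset atomically.

    Returns the number of records committed. Raises RuntimeError when
    fail_before_commit is set, to simulate a crash in the middle of a batch.
    """
    start = load_offset(conn, topic, part)
    batch = log[start:start + batch_size]
    if not batch:
        return 0
    # "with conn" commits on success and rolls back if an exception escapes.
    with conn:
        for txn_id, amount in batch:
            # Keyed by txn_id, so replaying a record overwrites, never duplicates.
            conn.execute(
                "INSERT OR REPLACE INTO predictions VALUES (?, ?)",
                (txn_id, int(classify(amount))),
            )
        if fail_before_commit:
            raise RuntimeError("simulated crash before commit")
        conn.execute(
            "INSERT OR REPLACE INTO offsets VALUES (?, ?, ?)",
            (topic, part, start + len(batch)),
        )
    return len(batch)


if __name__ == "__main__":
    # Hypothetical records: (transaction id, amount).
    log = [("t1", 20.0), ("t2", 9500.0), ("t3", 15.0), ("t4", 12000.0)]
    conn = open_store()

    def classify(amount):
        # Placeholder rule, not a trained Random Forest model.
        return amount > 5000

    process_batch(conn, log, "transactions", 0, 2, classify)
    try:
        process_batch(conn, log, "transactions", 0, 2, classify,
                      fail_before_commit=True)
    except RuntimeError:
        pass  # the job "restarts" and resumes from the stored offset
    process_batch(conn, log, "transactions", 0, 2, classify)

    print(load_offset(conn, "transactions", 0))  # prints 4
    print(conn.execute("SELECT COUNT(*) FROM predictions").fetchone()[0])  # prints 4
```

In a Spark Streaming job the same pattern would start the Kafka stream from the stored offsets and write each batch's offset ranges alongside its output. Cassandra writes are upserts, so keying results by transaction id, as the sketch does, also keeps replays from creating duplicates.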
Requirements
- Spark Streaming, Spark ML, Kafka, Cassandra, an IDE such as IntelliJ IDEA or Eclipse, Java, Scala