README

Background

This project is Assignment 2 of BGU University's Distributed System Programming course.
The project implements a map-reduce algorithm.
It is implemented in Java, using Amazon Web Services (AWS) and the Hadoop framework.
Instructions Assignment 2
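
As a rough illustration of the map-reduce idea, the sketch below sums occurrence counts per 3-gram in memory, without Hadoop. It is not the project's code: the class name MapReduceSketch, its methods, and the tab-separated record layout (3-gram, year, occurrences, further fields) are assumptions made for this example.

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory sketch of map, shuffle and reduce; not the project's Hadoop code.
class MapReduceSketch {

    // Map phase: turn one input record into a (3-gram, occurrences) pair.
    // Assumed record layout: "<3-gram>\t<year>\t<occurrences>\t..." (tab separated).
    static Map.Entry<String, Long> map(String record) {
        String[] fields = record.split("\t");
        return new AbstractMap.SimpleEntry<>(fields[0], Long.parseLong(fields[2]));
    }

    // Shuffle and reduce phases: group the pairs by key and sum their values.
    static Map<String, Long> run(List<String> records) {
        Map<String, Long> totals = new TreeMap<>();
        for (String record : records) {
            Map.Entry<String, Long> pair = map(record);
            totals.merge(pair.getKey(), pair.getValue(), Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList(
                "a b c\t1990\t3\t1",
                "a b c\t1991\t4\t2",
                "a b d\t1990\t1\t1");
        System.out.println(run(records));
    }
}
```

On a real cluster the grouping by key is done by the framework between the mapper and the reducer; here a single sorted map stands in for that step.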

Overview

In this assignment you will generate a knowledge base for a Hebrew word-prediction system, based on the Google 3-Gram Hebrew dataset, using Amazon Elastic MapReduce (EMR).
Output Examples
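
To give a feel for what a word-prediction knowledge base stores, the sketch below computes the simplest possible estimate of the probability of a third word given the two words before it: the count of the 3-gram divided by the total count of all 3-grams sharing the same first two words. The README does not state which formula the assignment requires, so this plain maximum-likelihood estimate, the class name TrigramProbabilitySketch and its method are assumptions for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; the assignment's actual probability formula may differ.
class TrigramProbabilitySketch {

    // Maximum-likelihood estimate of P(w3 | w1 w2):
    // count(w1 w2 w3) divided by the summed count of every 3-gram that starts with "w1 w2".
    static double probability(Map<String, Long> trigramCounts, String w1, String w2, String w3) {
        long trigram = trigramCounts.getOrDefault(w1 + " " + w2 + " " + w3, 0L);
        String prefix = w1 + " " + w2 + " ";
        long context = 0;
        for (Map.Entry<String, Long> entry : trigramCounts.entrySet()) {
            if (entry.getKey().startsWith(prefix)) {
                context += entry.getValue();
            }
        }
        // An unseen context yields 0.0 instead of dividing by zero.
        return context == 0 ? 0.0 : (double) trigram / context;
    }

    public static void main(String[] args) {
        Map<String, Long> counts = new HashMap<>();
        counts.put("a b c", 3L);
        counts.put("a b d", 1L);
        System.out.println(probability(counts, "a", "b", "c"));
    }
}
```

An unsmoothed estimate like this assigns zero probability to any 3-gram missing from the dataset, which is why prediction systems usually combine it with lower-order counts.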

How to run

Configure AWS credentials on your machine.
Create an S3 bucket with the name specified at App line 25.
Create a JAR for each step (5 steps). When creating each JAR file, ensure that its META-INF/MANIFEST.MF file specifies the appropriate main class.
Using the file system, rename the JARs to Step1, Step2, ... (exact names).
In the S3 bucket, create a jars folder.
Upload the JARs to <bucketName>/jars.
For a demo: the arbix.txt file is in <bucketName>. This file is used as an example input.
Run App.
After a successful run, the output will be in <bucketName>/outputs/.

Bucket Structure At Start:

Bucket Jars Structure At Start:

Note: make sure that the S3 bucket does not contain an output or log folder before the run.
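
The bucket requirements above can be checked before running App. The sketch below works on a plain list of object keys (for example, one copied from the S3 console) and makes no AWS calls. It is not part of the project: the class BucketLayoutSketch, the ".jar" suffix on the key names, and the "outputs/" and "logs/" folder prefixes used in the example are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical pre-run check on a bucket listing; works on plain strings only.
class BucketLayoutSketch {

    static final int STEP_COUNT = 5; // the README mentions five steps

    // Expected object keys for the step jars, e.g. "jars/Step1.jar".
    // The ".jar" suffix is an assumption; the README only gives the base names.
    static List<String> expectedJarKeys() {
        List<String> keys = new ArrayList<>();
        for (int i = 1; i <= STEP_COUNT; i++) {
            keys.add("jars/Step" + i + ".jar");
        }
        return keys;
    }

    // Returns the expected jar keys that are absent from the given bucket listing.
    static List<String> missingJars(List<String> bucketKeys) {
        List<String> missing = new ArrayList<>(expectedJarKeys());
        missing.removeAll(bucketKeys);
        return missing;
    }

    // True if any key lies under one of the given folder prefixes,
    // i.e. output or log objects left over from an earlier run.
    static boolean hasLeftovers(List<String> bucketKeys, List<String> folderPrefixes) {
        for (String key : bucketKeys) {
            for (String prefix : folderPrefixes) {
                if (key.startsWith(prefix)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> listing = Arrays.asList(
                "arbix.txt", "jars/Step1.jar", "jars/Step2.jar", "outputs/part-r-00000");
        System.out.println("Missing jars: " + missingJars(listing));
        System.out.println("Leftovers: " + hasLeftovers(listing, Arrays.asList("outputs/", "logs/")));
    }
}
```

The folder prefixes are passed in as a parameter because the README does not name the log folder; adjust them to whatever App actually writes.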

Logic Diagram

Diagram PDF
draw.io Diagram

Requested report

Step 5 final output

Final step5 output file

Step 5 final output

AmitNG2000/AWS-EMR-Knowledge-Base

Folders and files

Latest commit

History

Repository files navigation

README

Backround

Overview

How to run

Logic Diagram

Requested report

Final step5 output file

About

Topics

Resources

Stars

Watchers

Forks

Contributors 2

Languages