File:CLUSTER COMPUTING FOR AUTOMATED NETWORK ANALYSIS AT SCALE (IA clustercomputing1094559618).pdf

From Wikimedia Commons, the free media repository

Original file (1,275 × 1,650 pixels, file size: 2.16 MB, MIME type: application/pdf, 106 pages)


Summary

CLUSTER COMPUTING FOR AUTOMATED NETWORK ANALYSIS AT SCALE
Author
Brida, Benjamin J.
Title
CLUSTER COMPUTING FOR AUTOMATED NETWORK ANALYSIS AT SCALE
Publisher
Monterey, CA; Naval Postgraduate School
Description

Conventional single-node packet analyzers are unable to monitor network traffic at scale. In this thesis, elements of the Apache Hadoop ecosystem, including HBase, Spark, and MapReduce, are employed to analyze a large collection of network traffic. Limited analysis is conducted directly on packet capture next generation (pcapng) files stored on the Hadoop Distributed File System (HDFS) using MapReduce. Next, to allow repeated analysis of the same dataset without reading every source file in its entirety for each calculation, the pcapng files are parsed and the relevant metadata is bulk loaded into HBase, a Not Only Structured Query Language (NoSQL) database that uses HDFS for parallelization. This NoSQL database is then accessed via Apache Spark, where pertinent data is loaded into DataFrames and additional analysis of the network traffic takes place. This research demonstrates the viability of custom, modular, automated analytics that employ open-source software for parallelization to conduct traffic analysis at scale.
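The first stage described above, analysis run directly on pcapng files in a map/reduce style, can be illustrated with a small single-process sketch. This is not the thesis's implementation and it does not use Hadoop: it only assumes the general pcapng block layout (a 4-byte block type, a 4-byte total length, the body, and the total length repeated) and little-endian sections. The helper names (`iter_blocks`, `map_capture`, `reduce_counts`, `make_capture`) are hypothetical, and the captures are synthetic byte strings that omit the Interface Description Block a real capture would contain.

```python
import struct
from collections import Counter
from functools import reduce

SHB = 0x0A0D0D0A  # Section Header Block type
EPB = 0x00000006  # Enhanced Packet Block type


def iter_blocks(data):
    """Yield (block_type, body) per block; assumes little-endian sections."""
    offset = 0
    while offset + 12 <= len(data):
        block_type, total_len = struct.unpack_from("<II", data, offset)
        # Stop on a truncated or malformed block instead of guessing.
        if total_len < 12 or total_len % 4 or offset + total_len > len(data):
            break
        yield block_type, data[offset + 8 : offset + total_len - 4]
        offset += total_len


def map_capture(data):
    """Map step: one capture's bytes -> per-block-type counts."""
    return Counter(block_type for block_type, _ in iter_blocks(data))


def reduce_counts(left, right):
    """Reduce step: merge two partial counts."""
    return left + right


def make_block(block_type, body):
    """Build one block for the synthetic test data (hypothetical helper)."""
    body += b"\x00" * ((-len(body)) % 4)  # pad body to a 32-bit boundary
    total_len = 12 + len(body)
    header = struct.pack("<II", block_type, total_len)
    return header + body + struct.pack("<I", total_len)


def make_capture(n_packets):
    """Build a synthetic capture: one SHB followed by n_packets EPBs."""
    # SHB body: byte-order magic, major version, minor version, section length.
    shb = make_block(SHB, struct.pack("<IHHq", 0x1A2B3C4D, 1, 0, -1))
    # EPB body: interface id, timestamp high/low, captured and original length.
    packets = b"".join(
        make_block(EPB, struct.pack("<IIIII", 0, 0, i, 4, 4) + b"\xde\xad\xbe\xef")
        for i in range(n_packets)
    )
    return shb + packets


# Two synthetic "files" stand in for pcapng files spread across a cluster.
captures = [make_capture(3), make_capture(5)]
totals = reduce(reduce_counts, map(map_capture, captures), Counter())
print({hex(block_type): count for block_type, count in totals.items()})
```

Because each map step sees only one capture and the reduce step only merges partial counts, the same shape carries over to a framework that distributes the map calls across nodes.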


Subjects: big data; MapReduce; Hadoop; packet capture; traffic analysis; network analysis; Spark; HBase
Language English
Publication date June 2018
Current location
IA Collections: navalpostgraduateschoollibrary; fedlink
Accession number
clustercomputing1094559618
Source
Internet Archive identifier: clustercomputing1094559618
https://archive.org/download/clustercomputing1094559618/clustercomputing1094559618.pdf
Permission
This publication is a work of the U.S. Government as defined in Title 17, United States Code, Section 101. Copyright protection is not available for this work in the United States.

Licensing

Public domain
This work is in the public domain in the United States because it is a work prepared by an officer or employee of the United States Government as part of that person’s official duties under the terms of Title 17, Chapter 1, Section 105 of the US Code. Note: This only applies to original works of the Federal Government and not to the work of any individual U.S. state, territory, commonwealth, county, municipality, or any other subdivision. This template also does not apply to postage stamp designs published by the United States Postal Service since 1978. (See § 313.6(C)(1) of Compendium of U.S. Copyright Office Practices). It also does not apply to certain US coins; see The US Mint Terms of Use.

File history


Date/Time
18:55, 15 July 2020 (current)
Dimensions
1,275 × 1,650, 106 pages (2.16 MB)
Comment
FEDLINK - United States Federal Collection clustercomputing1094559618 (User talk:Fæ/IA books#Fork8) (batch 1993-2020 #11510)

Metadata