File:FREQUENCY-BASED FEATURE EXTRACTION FOR MALWARE CLASSIFICATION (IA frequencybasedfe1094561360).pdf

From Wikimedia Commons, the free media repository

Original file (1,275 × 1,650 pixels, file size: 1.86 MB, MIME type: application/pdf, 58 pages)

Summary

FREQUENCY-BASED FEATURE EXTRACTION FOR MALWARE CLASSIFICATION
Author
Erwert, Jonathan P.
Title
FREQUENCY-BASED FEATURE EXTRACTION FOR MALWARE CLASSIFICATION
Publisher
Monterey, CA; Naval Postgraduate School
Description

Traditional signature-based malware detection is effective, but it can only identify known malicious programs. This thesis attempts to use machine-learning techniques to successfully identify previously unknown malware from a set of Windows executable programs. We analyzed the histogram of 4-, 8-, and 16-bit-sequence values contained in each program. We then analyzed the effectiveness of using these histograms in part or in full as feature vectors for machine learning experiments. We also explored the effect of an offset at the beginning of each program and its impact on classifier performance. We successfully show that a machine learning classifier can be learned from these features, with an f-measure in excess of 90% attained in one of our experiments. Using a part of the histogram as the feature vector did not significantly affect classifier performance up to a point, nor did including an offset. Our results also suggest that features derived from histograms are better suited to tree-based algorithms compared to Bayesian methods.
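
The feature construction described above can be illustrated with a minimal sketch. It computes a normalized histogram of 4-, 8-, or 16-bit values from a program's raw bytes, optionally skipping an offset at the start, and truncates the histogram to use only part of it as the feature vector. The function names, the non-overlapping big-endian windowing for 16-bit values, the normalization to relative frequencies, and the "first `keep` bins" truncation are all assumptions made for illustration; the abstract does not specify how the thesis handles any of these details.

```python
from collections import Counter
from typing import List


def bit_sequence_histogram(data: bytes, bits: int, offset: int = 0) -> List[float]:
    """Relative frequency of every `bits`-wide value found in data[offset:].

    Hypothetical helper: the windowing and normalization are assumptions,
    not the method documented in the thesis.
    """
    if bits not in (4, 8, 16):
        raise ValueError("bits must be 4, 8, or 16")
    payload = data[offset:]
    if bits == 4:
        # Two nibbles per byte: high half first, then low half.
        values = [n for b in payload for n in (b >> 4, b & 0x0F)]
    elif bits == 8:
        values = list(payload)
    else:
        # Non-overlapping big-endian byte pairs; a trailing odd byte is dropped.
        values = [(payload[i] << 8) | payload[i + 1]
                  for i in range(0, len(payload) - 1, 2)]
    total = len(values)
    if total == 0:
        return [0.0] * (1 << bits)
    counts = Counter(values)
    return [counts.get(v, 0) / total for v in range(1 << bits)]


def partial_features(histogram: List[float], keep: int) -> List[float]:
    """Keep only the first `keep` bins (one assumed way to use part of a histogram)."""
    return histogram[:keep]


# Usage: the first four bytes of a typical Windows executable header.
sample = bytes([0x4D, 0x5A, 0x90, 0x00])
h8 = bit_sequence_histogram(sample, bits=8)
features = partial_features(h8, keep=128)
```

Vectors like `features` could then be passed to any classifier that accepts fixed-length numeric input. The histogram length grows as 2^bits (16, 256, or 65,536 bins), which is one plausible reason to consider using only part of it.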


Subjects: machine learning; malware analysis; static analysis
Language
English
Publication date
December 2018
Current location
IA Collections: navalpostgraduateschoollibrary; fedlink
Accession number
frequencybasedfe1094561360
Source
Internet Archive identifier: frequencybasedfe1094561360
https://archive.org/download/frequencybasedfe1094561360/frequencybasedfe1094561360.pdf
Permission
This publication is a work of the U.S. Government as defined in Title 17, United States Code, Section 101. Copyright protection is not available for this work in the United States.

Licensing

Public domain
This work is in the public domain in the United States because it is a work prepared by an officer or employee of the United States Government as part of that person’s official duties under the terms of Title 17, Chapter 1, Section 105 of the US Code. Note: This only applies to original works of the Federal Government and not to the work of any individual U.S. state, territory, commonwealth, county, municipality, or any other subdivision. This template also does not apply to postage stamp designs published by the United States Postal Service since 1978. (See § 313.6(C)(1) of Compendium of U.S. Copyright Office Practices). It also does not apply to certain US coins; see The US Mint Terms of Use.

File history


Date/Time
18:33, 20 July 2020 (current)
Dimensions
1,275 × 1,650, 58 pages (1.86 MB)
Comment
FEDLINK - United States Federal Collection frequencybasedfe1094561360 (User talk:Fæ/IA books#Fork8) (batch 1993-2020 #17055)
