Commit d40c6cf4 authored by Tarek Shah

folder for summarization stuff

parent ae852bc7
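import torch  # PyTorch backend; required for return_tensors='pt'
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# AutoModelWithLMHead is deprecated in transformers; AutoModelForSeq2SeqLM is
# the replacement for encoder-decoder models such as T5.
tokenizer = AutoTokenizer.from_pretrained('t5-base')
model = AutoModelForSeq2SeqLM.from_pretrained('t5-base', return_dict=True)

with open('summarizer_input.txt', 'r', encoding='utf-8') as file:
    text = file.read()

# T5 picks its task from the text prefix; input past 512 tokens is truncated.
inputs = tokenizer.encode("summarize: " + text, return_tensors='pt', max_length=512, truncation=True)
summary_ids = model.generate(inputs, max_length=150, min_length=80, length_penalty=5., num_beams=2)
# skip_special_tokens drops markers such as <pad> and </s> from the output.
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(summary)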
\ No newline at end of file
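The T5 script passes `truncation=True` with `max_length=512`, so anything past the first 512 tokens of the input is dropped before generation. One possible workaround, sketched here under assumptions, is to split the text into chunks and summarize each chunk separately. `chunk_words` and `max_words` are hypothetical names that are not part of the commit, and counting whitespace-separated words is only a rough stand-in for the tokenizer's real token count.

```python
def chunk_words(text, max_words=300):
    """Split text into consecutive chunks of at most max_words words.

    Hypothetical helper: words are whitespace-separated, which only
    approximates the number of tokens the T5 tokenizer would produce.
    """
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


if __name__ == "__main__":
    sample = "one two three four five six seven"
    # Each printed line is one chunk that could be summarized on its own.
    for chunk in chunk_words(sample, max_words=3):
        print(chunk)
```

Each chunk could then be prefixed with "summarize: " and passed through the same encode, generate and decode steps. A safe value for `max_words` depends on the text, because one word often maps to more than one token.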
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize, sent_tokenize

# Fetch the stop-word list and the Punkt sentence tokenizer data if missing.
nltk.download('stopwords')
nltk.download('punkt')

with open('summarizer_input.txt', 'r', encoding='utf-8') as file:
    text = file.read()

# Count how often each non-stop-word token occurs in the text.
stopWords = set(stopwords.words("english"))
words = word_tokenize(text)
freqTable = dict()
for word in words:
    word = word.lower()
    if word in stopWords:
        continue
    if word in freqTable:
        freqTable[word] += 1
    else:
        freqTable[word] = 1

# Score each sentence by summing the frequencies of the words it contains.
# Note: this is a substring test, so "art" also matches "party".
sentences = sent_tokenize(text)
sentenceValue = dict()
for sentence in sentences:
    lowered = sentence.lower()
    for word, freq in freqTable.items():
        if word in lowered:
            if sentence in sentenceValue:
                sentenceValue[sentence] += freq
            else:
                sentenceValue[sentence] = freq

# Average score across all scored sentences.
sumValues = 0
for sentence in sentenceValue:
    sumValues += sentenceValue[sentence]
average = int(sumValues / len(sentenceValue))

# Keep the sentences that score more than 1.2 times the average.
summary = ''
for sentence in sentences:
    if (sentence in sentenceValue) and (sentenceValue[sentence] > (1.2 * average)):
        summary += " " + sentence
print(summary)
\ No newline at end of file
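The frequency-based script scores each sentence by the frequencies of the words it contains and keeps those scoring above 1.2 times the average. The sketch below shows the same idea as a function that needs no NLTK downloads. It is an illustrative variant, not the commit's code: `tokenize`, `summarize`, `threshold` and the tiny `STOP_WORDS` set are assumptions, sentences are split with a naive regex, words are matched as whole tokens rather than substrings, and the average is not rounded to an integer.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption); the script uses NLTK's list.
STOP_WORDS = {"the", "a", "an", "is", "of", "and", "to", "in", "it", "on"}


def tokenize(text):
    """Lowercase the text and return its word tokens minus stop words."""
    return [w for w in re.findall(r"[a-z0-9']+", text.lower())
            if w not in STOP_WORDS]


def summarize(text, threshold=1.2):
    """Return the sentences scoring above threshold times the mean score."""
    # Naive split: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    freq = Counter(tokenize(text))
    # A sentence's score is the summed document frequency of its distinct words.
    scores = {s: sum(freq[w] for w in set(tokenize(s))) for s in sentences}
    if not scores:
        return ""
    average = sum(scores.values()) / len(scores)
    return " ".join(s for s in sentences if scores[s] > threshold * average)


if __name__ == "__main__":
    print(summarize("Cats chase mice. Dogs sleep. Birds sing."))
```

Because scores are sums rather than averages, longer sentences tend to score higher under this scheme, so the 1.2 multiplier mostly controls how many long, keyword-dense sentences survive.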
February 14, 2017 / 7:20 PM EST
/ CBS/AP
The man believed to have bought rifles for San Bernardino shooter Syed Rizwan Farook agreed to plead guilty to conspiring to provide materials for terrorists, the Department of Justice announced on Tuesday.
Enrique Marquez Jr., a longtime friend of Farook's, will plead guilty to providing material support and resources to terrorists, including weapons, explosives and personnel. As part of the plea agreement, he admitted that he conspired with Farook to carry out two attacks that never happened. Marquez also will plead guilty to making false statements to investigators.
Farook and his wife, Tashfeen Malik, opened fire at a holiday party for health inspectors at the Inland Regional Center on Dec. 2, 2015. Fourteen people were killed in the attack and 24 people were wounded. Farook and Malik were killed in a shootout with police shortly afterward. The pair declared their allegiance on Facebook to the Islamic State (ISIS) shortly before the attack.
Prosecutors said Marquez acknowledged being a “straw buyer” when he purchased two assault rifles from a sporting goods store that were used in the attack at the Inland Regional Center in San Bernardino. Prosecutors have said Marquez agreed to buy the weapons because the attackers feared Farook's Middle Eastern appearance might arouse suspicion.
Marquez also admitted to plotting with Farook in 2011 and 2012 to massacre college students and gun down motorists on a gridlocked California freeway, though those attacks never occurred.
Federal officials said the duo had envisioned halting traffic on state Route 91 with explosives and then firing at trapped motorists, or tossing pipe bombs into a crowded cafeteria at Riverside City College.
Marquez said he backed out of the plot after four men in the area about 60 miles inland from Los Angeles were arrested on terrorism charges in late 2012, the FBI has said in court documents.
“While his earlier plans to attack a school and a freeway were not executed, the planning clearly laid the foundation for the 2015 attack on the Inland Regional Center,” U.S. Attorney Eileen M. Decker said Tuesday.
In January, three people -- Farook's brother Syed Raheel Farook, his wife, Tatiana Farook, and her sister, Mariya Chernykh -- pleaded guilty to immigration fraud. They admitted to being part of a conspiracy to arrange a sham marriage between Marquez and Chernykh, the Department of Justice said.
Marquez is scheduled to appear before a federal judge on Thursday.
© 2017 CBS Interactive Inc. All Rights Reserved. This material may not be published, broadcast, rewritten, or redistributed. The Associated Press contributed to this report.
\ No newline at end of file