Difference between revisions of "Twitter Analysis DB Details"


Revision as of 09:29, 18 May 2020

The main page for the project is Twitter Analysis DB - OpenCircuits

General

Look at the GUI (link TBD) and just try it out.


Debugging

Building a Database

I am working on providing DB building facilities from the GUI. Because the build is sensitive to the input sources, it only works with the kinds of input sources I have used. Not everything is in the GUI as of this writing; this will probably change.

First, to enable the GUI features, adjust the parameter file so that:

  • self.show_db_def = True

Then you also need to point to the input files (these are in the GitHub repo):

  • self.tweet_input_file_name = r"./input/all_tweets_may_16_for_2020.txt" # where tweets are
  • self.word_input_file_name = r"./input/english-word-frequency/unigram_freq.csv" # word frequency data from Kaggle
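
Since the build depends on these files being present, it may help to check the paths before starting. The following is a minimal sketch, not part of the project: the helper name missing_input_files is hypothetical, and the paths are simply copied from the settings above and resolved relative to the current working directory.

```python
from pathlib import Path

# Paths copied from the parameter settings above.
TWEET_INPUT_FILE_NAME = r"./input/all_tweets_may_16_for_2020.txt"
WORD_INPUT_FILE_NAME = r"./input/english-word-frequency/unigram_freq.csv"


def missing_input_files(*file_names):
    """Return the given names that do not exist as regular files.

    Hypothetical helper for illustration; the project itself may
    check its inputs differently or not at all.
    """
    return [name for name in file_names if not Path(name).is_file()]


if __name__ == "__main__":
    missing = missing_input_files(TWEET_INPUT_FILE_NAME, WORD_INPUT_FILE_NAME)
    if missing:
        print("Missing input files:", missing)
    else:
        print("All input files found.")
```

Run it from the directory that contains the input folder, since the paths are relative.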

Then some processing options:

  • self.who_tweets = "djt" # id for who tweets, not much used yet
  • self.use_spacy = True # processing words to lemmas