Editing Twitter Analysis DB

Latest revision Your text
Line 1: Line 1:
 
= Goal =
 
= Goal =
This document could use improvement, but the software largely works ( see Status below ) and I am probably done unless there is expressed interest in knowing more, or I add major new features. Look at the history tab to see what is going on in the document. If you would rather look at the application than read about it, see [[Twitter Analysis DB GUI]], which you should check out at some point in any case.
+
This document is new, the software is alpha but "works" Look at history tab to see what is going on.  
  
Twitter Analysis DB is an open source Python program with a Graphical User Interface and an accompanying database ( and/or database creation tool ) for the analysis of a body of tweets.  Currently the program is in early alpha and its design goals are evolving at least as fast as the code is being written.
+
Twitter Analysis DB is a Python open source, program and an accompanying database ( and/or database creation tool ) for analysis of a body of tweets.  Currently the program is in early alpha and its design goals are evolving at least as fast as the code is being written.
  
 
The point:
 
The point:
Line 19: Line 19:
 
I will try to document it well enough that people can relatively easily extend and adapt the program.  Or, as an alternative, they can use other tools with the database, like SQLiteStudio. It should be fairly easy to download and use even for those without a desire to dive into the code, but I assume some knowledge of Python and a Python environment to run it in ( Python 3.6 or so ).
 
I will try to document it well enough that people can relatively easily extend and adapt the program.  Or, as an alternative, they can use other tools with the database, like SQLiteStudio. It should be fairly easy to download and use even for those without a desire to dive into the code, but I assume some knowledge of Python and a Python environment to run it in ( Python 3.6 or so ).
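
As a rough illustration of the "other tools" route, the sketch below uses Python's standard sqlite3 module to list the tables in a SQLite database file. It is not code from the application; the function name list_tables and the file name in the usage comment are assumptions made for this example, so substitute the path of the database you actually built.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in the SQLite database at db_path."""
    connection = sqlite3.connect(db_path)
    try:
        # sqlite_master is SQLite's built-in catalog of tables, indexes, and views.
        cursor = connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
        return [row[0] for row in cursor.fetchall()]
    finally:
        connection.close()


# Usage ( "tweets.db" is a hypothetical file name, not the application's ):
#     print(list_tables("tweets.db"))
```

Note that sqlite3.connect creates an empty file if the path does not exist, so an empty list may just mean the path is wrong.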
  
See the graphical user interface here ( with screen shot ): '''[[Twitter Analysis DB GUI]]'''.
+
See the graphical user interface here ( with screen shot ): [[Twitter Analysis DB GUI]].
  
 
This application is also part of a family of applications; see the category below: Python Projects.
 
This application is also part of a family of applications; see the category below: Python Projects.
  
This is an article started by Russ Hensel, see '''"http://www.opencircuits.com/index.php?title=Russ_hensel#About My Articles"'''   About My Articles for a bit of info.
+
This is an article started by Russ Hensel, see "http://www.opencircuits.com/index.php?title=Russ_hensel#About My Articles"    About My Articles for a bit of info.
  
Code will be at GitHub, see '''[[https://github.com/russ-hensel/twitter_analysis_db Code at GitHub]]''' See the GUI here at '''[[Twitter Analysis DB GUI]]'''
+
Code will be at GitHub, see [[https://github.com/russ-hensel/python_smart_terminal Code at GitHub]]  See the GUI here at [[Twitter Analysis DB GUI]]
  
 
== Status ==
 
== Status ==
Line 32: Line 32:
 
* Overall structure seems sound and extensible.
 
* Overall structure seems sound and extensible.
 
* Should be relatively easy to add additional queries, joins, columns, select criteria, without massive coding effort.
 
* Should be relatively easy to add additional queries, joins, columns, select criteria, without massive coding effort.
* But.... it is full of opportunities for enhancement. Right now my interests have shifted so I may not do much further work. Possibilities for improvement:
+
* But.... it is full of rough edges. Almost nothing has been polished up. Cited for improvement:
** Clean up tweets in the database build stage.  Pretty good now, but still some odd "words" get through.
+
** Clean up tweet in the database build stage.  Much "junk" like odd Unicode characters need to be managed.
 
** User interface is evolving but still not as user friendly as I would like.
 
** User interface is evolving but still not as user friendly as I would like.
** Selects == ( also known as Reports or Queries... ) are more demos of what is possible than what is truly useful and informative; several are experiments in the technology of the application.
+
** Report == Selects are more demos of what is possible than what is truly useful and informative.
** SQLite is still doing OK with 4 years of Trump tweets and 300K words.
+
** Biggest db so far has 300K words and only Trump tweets for this year.  Need to do a bigger db load, see how sql lite holds up.
** No database optimizations yet.... I run on a RAM drive for speed. DB is about 40 MBytes with 4 years of Trump tweets.
+
** No database optimizations yet.... I run on ram drive for speed
** Report formatting is basic, but workable.  Nicest overall format for human readability is probably "html", best to pass to other applications is probably "csv", most responsive in time is "msg"  -- sent to message area, often sub second response.
+
** DB is about 20 MBytes so not so bad
 +
** Report formatting is basic, but workable.  Nicest overall format is probably "html", most responsive in time is "msg"  -- sent to message area  
 
** Not sure what area of work is most useful; I have been driven lately by programming challenges and need to focus for a bit on improving usefulness.
 
** Not sure what area of work is most useful; I have been driven lately by programming challenges and need to focus for a bit on improving usefulness.
** Still printing some unnecessary junk used in debugging; remove most... if output is needed send it to py_log. The whole logging part of the application could use a careful review ( not happening soon ).
+
** Still printing lots of junk used in debugging, remove most... if output is needed send to py_log
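
To make the "csv" and "html" report formats mentioned above a little more concrete, here is a minimal sketch that turns query rows into each format using only the standard library. It is not the application's own report code ( which relies on a separate HTML module ); the function names rows_to_csv and rows_to_html are assumptions made for this example.

```python
import csv
import html
import io


def rows_to_csv(column_names, rows):
    """Format query results as CSV text, header row first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(column_names)
    writer.writerows(rows)
    return buffer.getvalue()


def rows_to_html(column_names, rows):
    """Format query results as a minimal HTML table, escaping every cell."""
    lines = ["<table>"]
    header = "".join(f"<th>{html.escape(str(name))}</th>" for name in column_names)
    lines.append(f"<tr>{header}</tr>")
    for row in rows:
        cells = "".join(f"<td>{html.escape(str(value))}</td>" for value in row)
        lines.append(f"<tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)
```

Rows as returned by a sqlite3 cursor ( a list of tuples ) can be passed straight in; the CSV text is the better choice for handing results to another program, the HTML for reading.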
  
'''What technical knowledge should users have ( and How ):'''
+
'''Who should use this program and How:'''
  
 
{| class="wikitable"
 
{| class="wikitable"
Line 53: Line 54:
 
|Person with little programming experience, no interest in Python.  Looking for download and install.
 
|Person with little programming experience, no interest in Python.  Looking for download and install.
 
|Probably should use another program.
 
|Probably should use another program.
|Not well suited to use this, but I may try to build an exe at some point.
+
|Not well suited to use this, I do not plan to fix this.
 
<!-------------------------------->
 
<!-------------------------------->
 
<!-------------------------------->
 
<!-------------------------------->
Line 64: Line 65:
 
|Modest Python experience
 
|Modest Python experience
 
|Modify all over the place, save data to database ......
 
|Modify all over the place, save data to database ......
|Program should be well documented in source, with some supplement in this wiki, or ask the author.
+
|Program should be well documented in soruce, with some supplement in this wiki, or ask the author.
 
<!--------------------------------
 
<!--------------------------------
 
|-valign="top"
 
|-valign="top"
Line 117: Line 118:
 
== Download ==
 
== Download ==
  
Code coming at GitHub, see [[https://github.com/russ-hensel/twitter_analysis_db GitHub Repository]] ( it is Python and you can run directly from the source ) Email me if you have issues ( use this link [[User:Russ_hensel]] ).
+
Code comming at GitHub, see [[https://github.com/russ-hensel/twitter_analysis_db GitHub Repository]] ( it is Python and you can run directly from the source ) Email me if you have issues ( use this link [[User:Russ_hensel]] ).
 
You will get a zip file, unzip it and you should get:
 
You will get a zip file, unzip it and you should get:
  
Line 123: Line 124:
 
     .... whatever --|
 
     .... whatever --|
 
                     |
 
                     |
                     |-- twitter_analysis_db --- all code required to run the application ( not sure if smart_terminal or python_smart_terminal or nothing is top level name, just put it in some well named place )
+
                     |-- tbd----------- --| -> all code required to run the application ( not sure if smart_terminal or python_smart_terminal or nothing is top level name, just put it in some well named place )
                                              |    some logs from my running of the code may or may not be present, these will be deleted as they creep in, when you run the program you will
+
                                          |    some logs from my running of the code may or may not be present, these will be deleted as they creep in, when you run the program you will
                                              |    get your own log files ... all typically named xxx.py_log  
+
                                          |    get your own log files ... all typically named xxx.py_log  
                                              |
+
                                          |
                                              | --> input      ---  input files used to build the database.
+
                                          | -- images -> image files, mostly screen shots, icons... or what ever, not important for the code.
                                              | --> output    ---  files produced by the database selects.
+
                                          | -- wiki_etc -> various files documenting program, including at least some of the material from this wiki  
                                              | --> images    ---  image files, mostly screen shots, icons... or what ever, not important for the code.
 
                                              | --> wiki_etc   ---  various files documenting program, including at least some of the material from this wiki  
 
                                              |                    also some sample output files
 
                                              | --> help      ---  help files, documentation for various selects
 
                                              | --> resources  ---  source code for the HTML module used in the application see [[Python Installation]]                                         
 
  
Put them in your system making "....whatever" anything convenient for your Python installation ( that is move the files to where you keep your Python source, not your installed module location ).   
+
Put them in your system making "....whatever" anything convenient for your Python ( that is move the files to where you keep your Python source ).   
  
Note that there may be a certain amount of leftover dead code in the directories. I am cleaning it out bit by bit; someday it may be nice and neat.  For now, if you want to tinker, look at the design info below first.
+
Note that there may be a certain amount of left over, dead code, in the directory I am cleaning out bit by bit, someday it may be nice and neat.  For now if you want to tinker look at the design info below first.
 
 
I have not yet made a requirements.txt or any installation routines.  Run as you would any source code until all imports work.  A couple of modules ( '''spacy''' and ''HTML'' ) proved a bit difficult.  I have directions for these in: '''[[Python Installation]]'''.
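
Until there is a requirements.txt, one way to shorten the "run until all imports work" loop is to check up front which modules Python cannot find. The sketch below is only an illustration: the helper name missing_modules is an assumption, and the module list in the usage comment contains just the two modules named above plus sqlite3, not the application's full dependency list.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found for import."""
    # find_spec returns None when a top-level module is not installed;
    # it locates the module without actually importing it.
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# Usage ( partial, assumed list of dependencies ):
#     print(missing_modules(["sqlite3", "spacy", "HTML"]))
```

Anything the function reports still has to be installed by hand.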
 
  
 
== Run ==
 
== Run ==
  
Run it until it stops complaining about dependencies ( in the console ), after that ( and perhaps even before ) the GUI should come up.  You are installed. ( Also see note above about '''[[Python Installation]]'''. )
+
Run it until it stops complaining about dependencies ( in the console ), after that ( and perhaps even before ) the GUI should come up.  You are installed.
  
I have run the program on Windows 10.  It should work on most OSes, but this is untested.  Let me know about issues.
+
I have run the program on both Windows 10 and Rasperian on a RPi.  It should work in most OS's.  Let me know about issues.
  
 
= Configure to Run =
 
= Configure to Run =
Line 178: Line 172:
 
*Now when you run it the button <Edit Parms> should let you edit the parameters.py file.  Edit it and save.
 
*Now when you run it the button <Edit Parms> should let you edit the parameters.py file.  Edit it and save.
  
*Hit the <Restart> button and in a flash ( more or less ) the program should restart with the new parameters; starting is fast because previously imported material does not need to be re-imported.
+
*Hit the <Restart> button.  In a flash the program should restart with the new parameters, starting is fast because previously imported material does not need to be reimported.
  
  
Line 206: Line 200:
 
*[[Python Desk Top Applications]]
 
*[[Python Desk Top Applications]]
 
*[https://github.com/russ-hensel/twitter_analysis_db GitHub Repository]
 
*[https://github.com/russ-hensel/twitter_analysis_db GitHub Repository]
*[[Python Installation]]
 
  
 
<!-----------
 
<!-----------
