crawler controller
I work on a project for which I have written 20+ crawlers, and they run 24/7 (with a good amount of sleep). Sometimes I need to update or restart the server, and then I have to start all the crawlers again. So I have written a script that controls all of them. It first checks whether a crawler is already running and, if not, starts it in the background. I also save the PIDs of all the crawlers in a text file so that I can kill a particular crawler immediately when needed.

Here is my code:

    import shlex
    from subprocess import Popen, PIPE

    site_dt = {'Site1 Name' : ['site1_crawler.py', 'site1_crawler.out'],
               'Site2 Name' : ['site2_crawler.py', 'site2_crawler.out']}

    location = "/home/crawler/"

    pidfp = open('pid.txt', 'w')

    def is_running(pname):
        p1 = Popen(["ps", "ax"], stdout=PIPE)
        p2 = Popen(["grep", pname...
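For comparison, the check-then-start flow described above could also be sketched around the saved PIDs instead of `ps ax | grep`. This is only a sketch under stated assumptions, not the script from the post: it assumes Python 3 on a POSIX system, and the helper names (`is_alive`, `load_pids`, `ensure_running`, `save_pids`) and the JSON PID file are hypothetical choices made for illustration.

```python
import json
import os
import subprocess
import sys


def is_alive(pid):
    """Return True if a process with this PID currently exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check without sending a signal
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def load_pids(path):
    """Read the name -> PID mapping, or return an empty one if the file is missing."""
    try:
        with open(path) as fp:
            return json.load(fp)
    except FileNotFoundError:
        return {}


def save_pids(path, pids):
    """Write the name -> PID mapping back to disk (hypothetical JSON format)."""
    with open(path, "w") as fp:
        json.dump(pids, fp)


def ensure_running(name, cmd, log_path, pids):
    """Start cmd in the background unless the recorded PID is still alive."""
    pid = pids.get(name)
    if pid is not None and is_alive(pid):
        return pid
    with open(log_path, "a") as log:
        proc = subprocess.Popen(
            cmd,
            stdout=log,
            stderr=subprocess.STDOUT,
            start_new_session=True,  # detach from the controller's session
        )
    pids[name] = proc.pid
    return proc.pid
```

A controller would call `load_pids`, loop over the entries of `site_dt` calling `ensure_running` with a command such as `[sys.executable, location + script]`, and finish with `save_pids`. One caveat with this design: PIDs can be reused, especially after a reboot, so a stale PID file may point at an unrelated process. Clearing the file at boot, or also checking the command line of the process, would guard against that.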