Diffstat (limited to 'content/posts.org')
-rw-r--r-- content/posts.org 208
1 files changed, 0 insertions, 208 deletions
diff --git a/content/posts.org b/content/posts.org
deleted file mode 100644
index 12b6eb8..0000000
--- a/content/posts.org
+++ /dev/null
@@ -1,208 +0,0 @@
-#+hugo_base_dir: ../
-#+hugo_section: ./posts
-
-#+hugo_weight: auto
-#+hugo_auto_set_lastmod: t
-
-#+author: Roger Gonzalez
-
-* Programming :@programming:
-All posts in here will have the category set to /programming/.
-** How I got a residency appointment thanks to Python, Selenium and Telegram :python:selenium:telegram:
-:PROPERTIES:
-:EXPORT_FILE_NAME: how-i-got-a-residency-appointment-thanks-to-python-and-selenium
-:EXPORT_DATE: 2020-08-02
-:END:
-Hello everyone!
-
-As some of you might know, I'm a Venezuelan 🇻🇪 living in Montevideo, Uruguay 🇺🇾.
-I've been living here for almost a year, but because of the pandemic my
-residency appointments have slowed down to a crawl, and in the middle of the
-quarantine they added a new appointment system. Before, there were no
-appointments: you just had to get there early and wait for the secretary to
-review your files and assign someone to attend to you. Now you could book an
-appointment online from the comfort of your own
-home/office. There was just one issue: *there were never appointments available*.
-
-That was a little stressful. I was developing a small /tic/ from checking the
-site multiple times a day, with no luck. Then I decided to build a
-bot that checks the site for me, so I could forget about it and let
-the computer do the work.
-
-*** Tech
-**** Selenium
-I had some experience with Selenium in the past because I had to run automated
-tests on an Android application, but I had never used it for the web. I knew it
-supported Firefox and had an extensive API to interact with websites. In the
-end, I just had to inspect the HTML and search for the "No appointments
-available" error message. If the message wasn't there, I needed a way to be
-notified so I could book my appointment as fast as possible.
-**** Telegram Bot API
-Telegram was my go-to because I have a lot of experience with it. It has a
-stupidly easy API that allows for superb bot management. I just needed the bot
-to send me a message whenever the "No appointments available" message wasn't
-found on the site.
-
-*** The plan
-Here comes the juicy part: How is everything going to work together?
-
-I divided the work into four parts:
-1) Inspecting the site
-2) Finding the error message on the site
-3) Sending a Telegram message if the error message was not found
-4) Deploying the script as a cron job on my VPS
-
-*** Inspecting the site
-Here are the pages I needed to inspect:
-- On the first page, I needed to click the bottom button. By inspecting the HTML,
- I found out that its name is ~form:botonElegirHora~
- [[/2020-08-02-171251.png]]
-- When the button is clicked, it loads a second page that has an error message
- if no appointments are found. The ID of that message is ~form:warnSinCupos~.
- [[/2020-08-02-162205.png]]
-
-*** Using Selenium to find the error message
-First, I needed to define the browser session and its settings. I wanted to run
-it in headless mode so no X session is needed:
-#+BEGIN_SRC python
-from selenium import webdriver
-from selenium.webdriver.firefox.options import Options
-
-options = Options()
-options.headless = True
-d = webdriver.Firefox(options=options)
-#+END_SRC
-
-Then, I opened the site, looked for the button (~form:botonElegirHora~) and
-clicked it:
-#+BEGIN_SRC python
-# This is the website I wanted to scrape
-d.get('https://sae.mec.gub.uy/sae/agendarReserva/Paso1.xhtml?e=9&a=7&r=13')
-elem = d.find_element_by_name('form:botonElegirHora')
-elem.click()
-#+END_SRC
-
-And on the new page, I looked for the error message (~form:warnSinCupos~):
-#+BEGIN_SRC python
-try:
- warning_message = d.find_element_by_id('form:warnSinCupos')
-except Exception:
- pass
-#+END_SRC
-
-This was working exactly how I wanted: It opened a new browser session, opened
-the site, clicked the button, and then looked for the message. At this point, if the
-message wasn't found, it did nothing. Next, the script needed to send me a
-message whenever the warning message wasn't found on the page.
-
-*** Using Telegram to send a message if the warning message wasn't found
-The Telegram bot API has a very simple way to send messages. If you want to read
-more about their API, you can check it [[https://core.telegram.org/][here]].
-
-There are a few steps you need to follow to get a Telegram bot:
-1) First, you need to "talk" to the [[https://core.telegram.org/bots#6-botfather][BotFather]] to create the bot.
-2) Then, you need to find your Telegram Chat ID. There are a few bots that can help
- you with that; I personally use ~@get_id_bot~.
-3) Once you have the ID, you should read the ~sendMessage~ API, since that's the
- only one we need now. You can check it [[https://core.telegram.org/bots/api#sendmessage][here]].
-
-So, by using the Telegram documentation, I came up with the following code:
-#+BEGIN_SRC python
-import requests
-
-chat_id = ''  # Insert your chat ID here
-telegram_bot_id = ''  # Insert your Telegram bot ID here
-telegram_data = {
- "chat_id": chat_id,
- "parse_mode": "HTML",
- "text": ("<b>Hay citas!</b>\nHay citas en el registro civil, para "
- f"entrar ve a {SAE_URL}")  # SAE_URL is defined in the complete script
-}
-requests.post(f'https://api.telegram.org/bot{telegram_bot_id}/sendmessage', data=telegram_data)
-#+END_SRC
-
-*** The complete script
-I added a few log messages and environment variables and voilà! Here is the complete code:
-#+BEGIN_SRC python
-#!/usr/bin/env python3
-
-import os
-import requests
-from datetime import datetime
-
-from selenium import webdriver
-from selenium.webdriver.firefox.options import Options
-
-from dotenv import load_dotenv
-
-load_dotenv() # This loads the environment variables from the .env file in the root folder
-
-TELEGRAM_BOT_ID = os.environ.get('TELEGRAM_BOT_ID')
-TELEGRAM_CHAT_ID = os.environ.get('TELEGRAM_CHAT_ID')
-SAE_URL = 'https://sae.mec.gub.uy/sae/agendarReserva/Paso1.xhtml?e=9&a=7&r=13'
-
-options = Options()
-options.headless = True
-d = webdriver.Firefox(options=options)
-d.get(SAE_URL)
-print(f'Headless Firefox Initialized {datetime.now()}')
-elem = d.find_element_by_name('form:botonElegirHora')
-elem.click()
-try:
- warning_message = d.find_element_by_id('form:warnSinCupos')
- print('No dates yet')
- print('------------------------------')
-except Exception:
- telegram_data = {
- "chat_id": TELEGRAM_CHAT_ID,
- "parse_mode": "HTML",
- "text": ("<b>Hay citas!</b>\nHay citas en el registro civil, para "
- f"entrar ve a {SAE_URL}")
- }
- requests.post('https://api.telegram.org/bot'
- f'{TELEGRAM_BOT_ID}/sendmessage', data=telegram_data)
- print('Dates found!')
-d.quit() # Ends the session and shuts down the browser and its driver process
-#+END_SRC
-
-Only one thing left to do: deploy everything to my VPS.
-
-*** Deploying and testing on the VPS
-This was very easy. I just needed to pull my git repo, install the dependencies from
-~requirements.txt~, and set up a cron job that runs every 10 minutes and checks the
-site. The cron settings I used were:
-#+BEGIN_SRC bash
-*/10 * * * * /usr/bin/python3 /my/script/location/registro-civil-scraper/app.py >> /my/script/location/registro-civil-scraper/log.txt
-#+END_SRC
-The ~>> /my/script/location/registro-civil-scraper/log.txt~ part appends the script's output to a log file.
-
-*** Did it work?
-Yes! And it worked perfectly. I got a message the following day at 21:00
-(weirdly enough, that's 00:00 GMT, so maybe their servers run on GMT
-and open new appointments at 00:00).
-[[/2020-08-02-170458.png]]
-
-*** Conclusion
-I have always loved using programming to solve simple problems. With this script, I
-didn't need to check the site every couple of hours to get an appointment, and
-honestly, I wasn't going to check past 19:00, so I would've never found it on
-my own.
-
-My brother is having similar issues in Argentina, and when I showed him this, he
-said one of the funniest phrases I've heard about my profession:
-
-> /"Programmers could take over the world, but they are too lazy"/
-
-I lol'd way too hard at that.
-
-I loved Selenium and how it worked. Recently I created a crawler using Selenium,
-Redis, peewee, and Postgres, so stay tuned if you want to know more about that.
-
-In the meantime, if you want to check the complete script, you can see it on my
-Git instance: https://git.rogs.me/me/registro-civil-scraper or on GitLab, if you
-prefer: https://gitlab.com/rogs/registro-civil-scraper
-
-* COMMENT Local Variables
-# Local Variables:
-# eval: (org-hugo-auto-export-mode)
-# End: