Catching Linux keyboard events with Python

Keyboard events matter not only for building applications that respond to the keyboard but also for ergonomics, keystroke statics and dynamics, and other scientific studies. In my case, I use keyboard events in my Master's dissertation, where keystroke dynamics is used to identify a user's profile from her or his typing behavior for computer security purposes. Over the past decades, many scientific studies have been carried out to improve the proposed methods. It is believed that the first studies of keystroke behavior took place in the 1860s, in the telegraph era, when operators were identified by their unique “signature”.

To get my job done I created a mini-program in Python that I will post gradually. For now, I will talk about “Catching Linux Keyboard events” using Python.

Required background knowledge:

Flowchart

Here is the Mermaid code used for the flowchart:

graph TB
    A(Start)-->|Event happens|B{What Type?}
    B-->|Key Down|C[Save: tmp key down]
    C-.->|Key Up|D[Calc: key up - tmp key down]
    B-->|Key Auto|E[Save: tmp key down]
    E-.->|Key Up Happens|D
    D-->F((End))
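The flowchart can be read as a small state machine. The following is a minimal sketch of it in Python, not the program built later in this post: the function name `hold_times` and the `(timestamp, code, value)` tuple layout are hypothetical, and it assumes that a key auto event should keep the first saved key down time rather than overwrite it.

```python
KEY_UP, KEY_DOWN, KEY_AUTO = 0, 1, 2

def hold_times(events):
    """Yield (code, seconds_held) for each completed key press.

    `events` is an iterable of (timestamp_seconds, code, value) tuples
    (a hypothetical layout chosen for this sketch).
    """
    pressed_at = {}  # key code -> timestamp of the first key down
    for timestamp, code, value in events:
        if value in (KEY_DOWN, KEY_AUTO):
            # Assumption: keep the earliest timestamp so auto-repeat
            # does not reset the start of the press.
            pressed_at.setdefault(code, timestamp)
        elif value == KEY_UP and code in pressed_at:
            # Key up minus the saved key down time.
            yield code, timestamp - pressed_at.pop(code)

# Key 36 goes down, auto-repeats once, then is released; key 24 is
# released without a recorded press and is therefore ignored.
sample = [(10.0, 36, KEY_DOWN), (10.25, 36, KEY_AUTO),
          (10.5, 36, KEY_UP), (11.0, 24, KEY_UP)]
print(list(hold_times(sample)))
```

Removing the entry on key up means a stray second key up for the same key produces no output.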

Code

The following code follows the conventions defined in the Linux kernel documentation for input devices; see the input directives and the event directives.

We will use a QWERTY keyboard and ignore all other keys. In the keychar dictionary, the keys are key codes and the values are the corresponding characters.

keychar = {2: '1', 3: '2', 4: '3', 5: '4', 6: '5', 7: '6', 8: '7', 9: '8', 10: '9', 11: '0',
            16: 'q', 17: 'w', 18: 'e', 19: 'r', 20: 't', 21: 'y', 22: 'u', 23: 'i', 24: 'o',
            25: 'p', 30: 'a', 31: 's', 32: 'd', 33: 'f', 34: 'g', 35: 'h', 36: 'j', 37: 'k',
            38: 'l', 44: 'z', 45: 'x', 46: 'c', 47: 'v', 48: 'b', 49: 'n', 50: 'm'} # QWERTY chars

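Where does keystroke data come from on most Linux-based operating systems? In general it comes from an event device under /dev/input, often /dev/input/event0. The number differs between machines, so the script accepts it as an optional command-line argument: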

import sys
device_name = "/dev/input/event" + (sys.argv[1] if len(sys.argv) > 1 else "0")
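If event0 turns out not to be the keyboard, one way to find the right node is to look at /proc/bus/input/devices, where the kernel lists each input device together with its handlers. The sketch below is an illustration only: the function name `keyboard_event_nodes` is hypothetical, the sample text is made up for the example, and it assumes the usual layout of that file (blank-line separated blocks with `N: Name=` and `H: Handlers=` lines).

```python
def keyboard_event_nodes(devices_text):
    """Return (name, event_node) pairs for devices that have a 'kbd' handler."""
    found = []
    for block in devices_text.strip().split("\n\n"):
        name, handlers = None, []
        for line in block.splitlines():
            if line.startswith("N: Name="):
                name = line[len("N: Name="):].strip('"')
            elif line.startswith("H: Handlers="):
                handlers = line[len("H: Handlers="):].split()
        if "kbd" in handlers:
            for handler in handlers:
                if handler.startswith("event"):
                    found.append((name, "/dev/input/" + handler))
    return found

# Made-up sample in the assumed layout of /proc/bus/input/devices.
SAMPLE = """\
I: Bus=0011 Vendor=0001 Product=0001 Version=ab41
N: Name="AT Translated Set 2 keyboard"
H: Handlers=sysrq kbd event3 leds

I: Bus=0011 Vendor=0002 Product=0007 Version=01b1
N: Name="Example TouchPad"
H: Handlers=mouse0 event5
"""

print(keyboard_event_nodes(SAMPLE))
```

On a real machine the text would come from reading /proc/bus/input/devices. Devices such as power buttons may also carry a kbd handler, so the result can contain more than one candidate.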

The data read from this device is binary, so let’s describe its layout in order to decode it:

import struct
FORMAT = 'llHHI'
EVENT_SIZE = struct.calcsize(FORMAT)

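Here llHHI means long int, long int, unsigned short, unsigned short and unsigned int: the two timestamp fields (seconds and microseconds), the event type, the key code and the value. struct.calcsize then gives the size in bytes of one event record, which we store in EVENT_SIZE.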
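To see what the format string does without touching a real device, a record can be packed and unpacked in memory. This is a small sketch with made-up field values; the timestamp and key code are chosen only for illustration.

```python
import struct

FORMAT = 'llHHI'
EVENT_SIZE = struct.calcsize(FORMAT)

# Build one synthetic record: event type 0x01, key code 36, value 1,
# stamped at 1700000000 seconds and 250000 microseconds.
raw = struct.pack(FORMAT, 1700000000, 250000, 0x01, 36, 1)

# Decode it again with the same format string.
tv_sec, tv_usec, etype, code, value = struct.unpack(FORMAT, raw)

print(EVENT_SIZE, len(raw))
print(tv_sec, tv_usec, etype, code, value)
```

The packed record is always exactly EVENT_SIZE bytes long. The size itself depends on the platform, because `l` is a native long; on a typical 64-bit Linux system it is 24 bytes.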

According to the event directives, EV_KEY is used to describe state changes of keyboards and other key-like devices, so let’s define it:

EV_KEY = 0x01
KEY_DOWN = 1
KEY_AUTO = 2
KEY_UP = 0

This means we only handle events whose type is EV_KEY (0x01). Following the directives, we also define an integer value for key down (the key is pressed), key up (the key is released) and key auto (the key is pressed and held). Key auto ends when the same key is released or when another key is pressed; in the latter case another cycle (event) starts.

Now let’s create a function called time_calc that computes elapsed time. Each timestamp comes in two parts, seconds and microseconds, for both the press and the release. The function answers the question: how long was the key held down?

def time_calc(second_r, microsecond_r, second_p, microsecond_p):
  # Release time minus press time, both converted to seconds.
  pressed_time = (second_r + microsecond_r / 1000000.) - (second_p + microsecond_p / 1000000.)
  return pressed_time
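A worked example helps to check the arithmetic. Because the seconds field is a large epoch value, adding microseconds to it as a float leaves little precision to spare, so this sketch does the subtraction in whole microseconds first. The function name `hold_time_us` and the timestamps are hypothetical; it is an alternative formulation for illustration, not a replacement for time_calc.

```python
def hold_time_us(sec_r, usec_r, sec_p, usec_p):
    """Return the hold time in whole microseconds (release minus press)."""
    return (sec_r - sec_p) * 1_000_000 + (usec_r - usec_p)

# Pressed at 1700000000 s + 900000 us, released at 1700000001 s + 260123 us.
elapsed_us = hold_time_us(1700000001, 260123, 1700000000, 900000)

# 1 s minus 639877 us leaves 360123 us.
print(elapsed_us)
print(elapsed_us / 1_000_000)
```

The example crosses a second boundary on purpose: the microsecond difference alone is negative, and the seconds difference makes up for it.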

With everything defined, let’s open the device in ‘rb’ (read binary) mode, read one event of EVENT_SIZE bytes at a time, and use the timestamp fields, which follow the GNU struct timeval, to produce each keystroke (alpha character, key code, pressed time).

with open(device_name, "rb") as source_data:
  code_time = {}  # key code -> (tv_sec, tv_usec) of the last key down
  while True:
    event = source_data.read(EVENT_SIZE)
    if len(event) < EVENT_SIZE:
      break  # device closed or incomplete record
    (tv_sec, tv_usec, etype, code, value) = struct.unpack(FORMAT, event)
    if etype == EV_KEY:
      if value == KEY_DOWN:
        code_time[code] = (tv_sec, tv_usec)
      if value == KEY_UP and code in code_time:
        keystroke = (code, time_calc(tv_sec, tv_usec, *code_time[code]))

        if code in keychar:
          print("Key: %s, Key code: %u, Time pressed: %f" %
                (keychar[code], keystroke[0], keystroke[1]))

Results

The output should look like this:

Key: j, Key code: 36, Time pressed: 0.360123
Key: o, Key code: 24, Time pressed: 0.415161
Key: r, Key code: 19, Time pressed: 0.416548
Key: g, Key code: 34, Time pressed: 0.665365
Key: e, Key code: 18, Time pressed: 0.687727

Final Code:

import sys
import struct

device_name = "/dev/input/event" + (sys.argv[1] if len(sys.argv) > 1 else "0")

FORMAT = 'llHHI'
EVENT_SIZE = struct.calcsize(FORMAT)
EV_KEY = 0x01
KEY_DOWN = 1
KEY_AUTO = 2
KEY_UP = 0

keychar = {2: '1', 3: '2', 4: '3', 5: '4', 6: '5', 7: '6', 8: '7', 9: '8', 10: '9', 11: '0',
            16: 'q', 17: 'w', 18: 'e', 19: 'r', 20: 't', 21: 'y', 22: 'u', 23: 'i', 24: 'o',
            25: 'p', 30: 'a', 31: 's', 32: 'd', 33: 'f', 34: 'g', 35: 'h', 36: 'j', 37: 'k',
            38: 'l', 44: 'z', 45: 'x', 46: 'c', 47: 'v', 48: 'b', 49: 'n', 50: 'm'}

def time_calc(sec_a, usec_a, sec_b, usec_b):
  time_pressed = (sec_a+usec_a/1000000.) - (sec_b+usec_b/1000000)
  return time_pressed

with open(device_name, "rb") as source_data:
  code_time = {}  # key code -> (tv_sec, tv_usec) of the last key down
  while True:
    event = source_data.read(EVENT_SIZE)
    if len(event) < EVENT_SIZE:
      break  # device closed or incomplete record
    (tv_sec, tv_usec, etype, code, value) = struct.unpack(FORMAT, event)
    if etype == EV_KEY:
      if value == KEY_DOWN:
        code_time[code] = (tv_sec, tv_usec)
      if value == KEY_UP and code in code_time:
        keystroke = (code, time_calc(tv_sec, tv_usec, *code_time[code]))

        if code in keychar:
          print("Key: %s, Key code: %u, Time pressed: %f" %
                (keychar[code], keystroke[0], keystroke[1]))
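Reading /dev/input usually requires elevated permissions, which makes the listing hard to try out. The sketch below applies the same logic to any binary stream, so it can be exercised with synthetic data held in memory. The generator name `keystrokes` and the fake events are hypothetical, and stopping on a short read is an assumption of this sketch.

```python
import io
import struct

FORMAT = 'llHHI'
EVENT_SIZE = struct.calcsize(FORMAT)
EV_KEY, KEY_DOWN, KEY_UP = 0x01, 1, 0

def keystrokes(stream):
    """Yield (code, seconds_held) for each press/release pair read from stream."""
    pressed = {}  # key code -> (tv_sec, tv_usec) of the key down
    while True:
        chunk = stream.read(EVENT_SIZE)
        if len(chunk) < EVENT_SIZE:
            return  # end of data or an incomplete record
        tv_sec, tv_usec, etype, code, value = struct.unpack(FORMAT, chunk)
        if etype != EV_KEY:
            continue  # skip events that are not key events
        if value == KEY_DOWN:
            pressed[code] = (tv_sec, tv_usec)
        elif value == KEY_UP and code in pressed:
            sec_p, usec_p = pressed.pop(code)
            yield code, (tv_sec - sec_p) + (tv_usec - usec_p) / 1000000.0

# Synthetic data: one non-key event (type 0), then key code 36 down and up.
fake = io.BytesIO(
    struct.pack(FORMAT, 100, 0, 0x00, 0, 0)
    + struct.pack(FORMAT, 100, 250000, EV_KEY, 36, KEY_DOWN)
    + struct.pack(FORMAT, 100, 750000, EV_KEY, 36, KEY_UP)
)
result = list(keystrokes(fake))
print(result)
```

Passing a file opened with open(device_name, "rb") instead of the in-memory stream would process real events the same way.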

Next post in this category:

In the next post we’ll improve the code, try to save the results in a JSON file and create a function to compare temporary values with saved values.

EOF