Blog Post #3

Roman Ibrahimov and Yifei Hu are going to work together on a robotics project. Roman is an expert in swarm control and has already published several papers in this domain. Yifei focuses on natural language processing (NLP), which could be a potential channel for human-robot interaction.

Currently, the team is working on creating the environment and implementing some basic commands. We have decided to use Python for both the NLP part and the robotics part.

On the Robotics side:

We have integrated the Webots simulator with ROS2. In the attached video, we guide a swarm of three E-puck robots. The robots subscribe to a topic fed by a Python-based publisher, which sends the control inputs to all of them simultaneously.


On the NLP side:

Our current goal is to parse and map 4 basic commands from natural language to explicit functions. The 4 commands are: TurnLeft(degree), TurnRight(degree), MoveForward(distance), MoveBackward(distance).
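One way to picture these "explicit functions" is a small dispatch table. The following is only a minimal sketch: the `Command` record, the `COMMANDS` table, and the `dispatch` helper are hypothetical names introduced for illustration, and the functions simply return a description of the action instead of driving a robot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    """Hypothetical record describing one parsed command."""
    name: str
    value: float  # degrees for turns, distance for moves


def TurnLeft(degree):
    return Command("TurnLeft", float(degree))


def TurnRight(degree):
    return Command("TurnRight", float(degree))


def MoveForward(distance):
    return Command("MoveForward", float(distance))


def MoveBackward(distance):
    return Command("MoveBackward", float(distance))


# Lookup table from command name to the function that builds it.
COMMANDS = {
    "TurnLeft": TurnLeft,
    "TurnRight": TurnRight,
    "MoveForward": MoveForward,
    "MoveBackward": MoveBackward,
}


def dispatch(name, value):
    """Call the function registered under `name` with the given parameter."""
    return COMMANDS[name](value)
```

With this layout, the NLP side only has to produce a command name and a number, for example `dispatch("TurnLeft", 90)`.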

Suppose a human gives us a sentence in natural language and we need to determine which of the four commands should be executed. How would we do that? Here we introduce a pre-trained language model called "BERT-mrpc". This model uses the transformer architecture and can estimate whether two given sentences are semantically similar. Below is an example of using the model:

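    import torch

    # `tokenizer`, `model`, and `classes` are assumed to be loaded beforehand
    # (the pre-trained BERT-mrpc model and its two class labels).
    def determine_paraphrase(sent1, sent2):
        print("sentence 1: " + sent1)
        print("sentence 2: " + sent2)
        # Encode the two sentences together as one input pair.
        paraphrase = tokenizer(sent1, sent2, return_tensors="pt")
        paraphrase_classification_logits = model(**paraphrase, return_dict=True).logits
        # Convert the logits into one probability per class.
        paraphrase_results = torch.softmax(paraphrase_classification_logits, dim=1).tolist()[0]
        for i in range(len(classes)):
            print(f"{classes[i]}: {int(round(paraphrase_results[i] * 100))}%")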

The function prints two probabilities, one per class, for example: [32%, 68%] (similar, not similar).

Mapping is only one of the steps on the NLP side. We also need to extract the parameters from a sentence. For example, if the user says "please go straight for 10 feet", we must extract "10 feet" from the sentence and convert it to a parameter with the proper unit (for example, m or cm).

Extracting parameters and converting them to a specific unit is a tricky problem. We could use POS (part-of-speech) taggers to identify the numbers or directions. However, we do not have much data to validate this idea. For now, we might have to ask users to state explicit numbers and supported units in their natural language commands.
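If users do stick to explicit numbers and a small set of units, a regular expression may be enough as a first step. The sketch below is an assumption-laden illustration, not our final parser: the unit table `UNIT_TO_METERS`, the accepted spellings, and the function name `extract_distance_m` are all hypothetical, and it only handles distances.

```python
import re

# Conversion factors to meters. The accepted unit spellings are an assumption.
UNIT_TO_METERS = {
    "m": 1.0, "meter": 1.0, "meters": 1.0,
    "cm": 0.01, "centimeter": 0.01, "centimeters": 0.01,
    "ft": 0.3048, "foot": 0.3048, "feet": 0.3048,
}

# A number (integer or decimal) followed by an alphabetic word.
_NUMBER_UNIT = re.compile(r"(\d+(?:\.\d+)?)\s*([a-zA-Z]+)")


def extract_distance_m(sentence):
    """Return the first '<number> <unit>' distance in meters, or None."""
    for number, unit in _NUMBER_UNIT.findall(sentence):
        factor = UNIT_TO_METERS.get(unit.lower())
        if factor is not None:
            return float(number) * factor
    return None  # no explicit number with a known unit was found
```

For "please go straight for 10 feet." this picks out "10" and "feet" and converts the value to meters. A sentence without a number or with an unknown unit yields `None`, which the caller could use to ask the user to rephrase.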

A completed NLP parsing and mapping function would look like the following:

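    def mapping_and_parsing(input_sentence):
        # compare_similarity and Parsing are placeholders for the steps described above;
        # compare_similarity is assumed to return a similarity score.
        best_match, best_score = None, float("-inf")
        for s in existing_commands:
            score = compare_similarity(input_sentence, s)
            # find the best match
            if score > best_score:
                best_match, best_score = s, score
        # ......
        Parameter = Parsing(input_sentence)
        return [best_match, Parameter]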
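To make the "find the best match" step concrete, here is a small runnable sketch. It deliberately swaps the BERT model for a much cruder stand-in, word overlap (Jaccard similarity), so that it runs without downloading a model. The reference sentences and the names `token_overlap`, `REFERENCE_SENTENCES`, and `best_command` are hypothetical.

```python
def token_overlap(a, b):
    """Jaccard similarity over lowercase word sets (a stand-in, not BERT)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


# Hypothetical reference sentence for each of the four commands.
REFERENCE_SENTENCES = {
    "TurnLeft": "turn left",
    "TurnRight": "turn right",
    "MoveForward": "move forward",
    "MoveBackward": "move backward",
}


def best_command(sentence, similarity=token_overlap):
    """Score the sentence against every reference and return the top command."""
    scores = {
        name: similarity(sentence, reference)
        for name, reference in REFERENCE_SENTENCES.items()
    }
    return max(scores, key=scores.get)
```

Because `similarity` is a parameter, a model-based scorer could be passed in later without changing the selection logic. Word overlap cannot tell that "go straight" means "move forward", which is exactly the gap a semantic model is meant to close.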

So far, we can only process 4 basic commands, but we are moving in the right direction with curiosity and excitement!