You can enter any prompt, such as finding a certain location through Google Maps or booking a ticket for your next trip. After entering the prompt, all you have to do is wait while the Hugging Face Open Computer Agent opens the required programs and completes the task.
It should be noted that, at this early stage, the Hugging Face AI agent handles simpler tasks at an impressive speed. With more complex prompts, however, it seems to struggle and sometimes makes mistakes.
Although it is currently free to use and easily accessible, the company notes that users will need to wait a few seconds or minutes in a virtual queue before testing it, depending on the number of requests from other users.
“As vision models become more capable, they become able to power complex agentic workflows,” a member of the Hugging Face agents team stated in his X post.
But it seems that the main goal isn’t to build the most capable and advanced AI agent. Instead, by launching the Open Computer Agent, the company wants to show how far artificial intelligence has come and what it can already do.
“Especially Qwen-VL models, that support built-in grounding, i.e. ability to locate any element in an image by its coordinates,” he also stated.
It’s true that these kinds of AI agents need more time and development before they can handle more complex tasks, but more and more companies seem to be turning their attention to them.
Stay tuned for more updates!