The Navy wants to teach robots to teach themselves in a learning-by-doing experiment

Tom Temin (@tteminWFED)

November 30, 2022 2:41 pm

Can robots teach themselves new tricks? In theory they can, according to researchers at the U.S. Naval Research Laboratory. In a new white paper, they lay out how robots, like people, can learn by using a curriculum and a learning agenda. To learn how they will test that theory, the Federal Drive with Tom Temin spoke with research scientists Laura Hiatt and Mark Roberts from the U.S. Naval Research Laboratory.

Interview transcript:

Tom Temin: All right. So tell us first of all, before we get into what you’re doing with robots, you make a pretty strong distinction in this document between curricula and learning agendas. And that’s something that actually federal human beings are dealing with right now, with a learning agenda for improving customer experience and all of this. What is the learning agenda? And how does it differ from a curriculum?

Laura Hiatt: So this is an important distinction in our work. A curriculum, as we define it, is similar to what we all encounter when we, for example, go to school. So a teacher or a coach or some other professional determines the order in which we learn things. Usually, this is from easy to hard, for example, a scaffolding of skills that lead up to a gymnast learning a backflip or learning a new sport like racquetball. And this order of learning things allows students to kind of quickly ramp up their skill set and learn what they need to know to achieve some level of competency, like reaching a certain level of gymnastics, making an advanced sports team, etc. A learning agenda, however, is when the person decides for themselves what they need to learn and practice to accomplish their goals. So they can do this, for example, based on what skills they are already comfortable with, and then expanding from there. And one important implication of the students setting their own learning agenda is that they can evaluate and adjust at any time, even in the middle of a practice session, what skill they work on next, based on their progress.
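
The distinction Hiatt draws can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the NRL white paper: the skill names, the proficiency scores, the 0.8 mastery threshold and the rule of practicing the easiest unmastered skill are all hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical gymnastics skills, listed easy to hard (illustrative only).
SKILLS: List[str] = ["forward roll", "handstand", "back walkover", "backflip"]


def curriculum(skills: List[str]) -> List[str]:
    """Curriculum: a teacher fixes the whole order up front."""
    return list(skills)


def next_on_agenda(
    skills: List[str], proficiency: Dict[str, float], mastered_at: float = 0.8
) -> Optional[str]:
    """Learning agenda: the learner chooses its own next skill.

    Assumed rule: practice the easiest skill whose self-assessed
    proficiency is still below the mastery threshold.
    """
    for skill in skills:
        if proficiency.get(skill, 0.0) < mastered_at:
            return skill
    return None  # nothing left to practice


if __name__ == "__main__":
    scores = {"forward roll": 0.9, "handstand": 0.5}
    print(curriculum(SKILLS))              # fixed order, set by the teacher
    print(next_on_agenda(SKILLS, scores))  # learner's own next pick
    scores["handstand"] = 0.85             # progress made mid-session
    print(next_on_agenda(SKILLS, scores))  # the pick is re-evaluated
```

The curriculum never changes once it is written, while the learner's pick is recomputed every time its own scores change, which is the mid-session adjustment Hiatt describes.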

Tom Temin: OK, I understand. I would say maybe another analogy might be, and you can tell me if this is accurate, if you are a pianist, the score is the curriculum, but how you work on your trills and your arpeggios and your scales, that’s your learning agenda.

Laura Hiatt: Yeah, that’s one way of thinking of it. Perhaps a learning book could be the curriculum, but you decide what order you practice the songs in and what skills you repeat over and over again.

Tom Temin: Sure. OK, and Mark, so robots. First of all, what type of robot are you thinking of to demonstrate this? Because there are software robots, bots, RPA and all of this. Then there are robots in the Navy context, which walk around with fire extinguishers.

Mark Roberts: These platforms will be fairly straightforward. We’re going to start with sort of research platforms as a starting point. And the research will focus on a rolling robot first, and then one called the Stretch, and then eventually move to more complicated robots.

Tom Temin: Are these the humanoid types of robots that you see? I mean, with robots, sometimes it’s just an articulating arm, and sometimes it walks on its two feet.

Mark Roberts: Maybe eventually. In the project, we’re hoping to get to that point. At the start, we’ll use just the sort of base with an articulating arm like you’re talking about.

Tom Temin: OK, and then tell us how a learning agenda can work with a robot which has a brain in the sense of a bunch of microprocessors, but it’s not really a thinking machine.

Laura Hiatt: So that’s kind of the crux of the project. One of the persistent challenges in robotics, kind of as you’ve alluded to, is the high cost of programming robots to be generally capable. And this project will help to overcome that by allowing robots, as we said, to perform kind of self-guided learning to help themselves become more capable. And so the first part is letting robots set their own learning agenda, like what we mentioned above. The robot can start with what it knows how to do, however that’s represented in its computer, and kind of work its way to harder and harder tasks, adapting as it goes to maximize learning. And that does involve the robot being able to make a determination of which tasks are incrementally harder than one another. But one of the new and exciting parts of this project is that we’re also using goals as part of this structure of learning, in addition to just moving from kind of easy to hard tasks. So for example, a robot performing a task for itself, like picking up a coffee mug, might look different than a robot performing a task followed by another action in service of another person. So think of picking up a coffee mug and then handing it over to somebody else. You’re doing technically the same task, you’re just picking up the coffee mug, but the goal is different. And that makes it a little bit harder. So that’s one of the main things here: we’re using goals to structure the learning by realizing that the goals can and should affect the learning agenda as well.

Tom Temin: We’re speaking with Laura Hiatt and Mark Roberts, scientists at the U.S. Naval Research Laboratory. What I’m hearing is there’s an implication that there is some artificial intelligence, or even robotic process automation, within a robot. And is that something new in the robotic world, such that it can teach itself and change what it does over time?

Laura Hiatt: Yeah, there is a big intersection between robotics and artificial intelligence when we talk about kind of the robot’s metaphorical brain. So absolutely, the techniques that we’ll be using here are machine learning and goal reasoning, which is a type of automated planning. And so those are known AI techniques that we’re going to be leveraging for use on our robot.

Tom Temin: And tell us about the experiment, the specific tasks that you plan for the robots.

Laura Hiatt: So we’re going to be focusing, to begin with, on opening a door. So this is a task that seems simple, especially to people, because people are very adept at going around in the world. But if you think about it, doors have a lot of different knobs. Some doors don’t have any doorknobs. Some doors you push, some doors you pull. There’s a lot of variability there. So let’s say, for example, that we have a robot that knows how to push a swinging door open. So perhaps next, it might learn to open a door where you have to push a bar to get through. That would be kind of the next level of difficulty. Or maybe it’ll try one after that with a handle where you have to kind of use your thumb, or some other digit for a robot, to push the top down to open it. So that’s kind of moving through doors at different levels of difficulty. And then this can be interspersed with opening the door for different goals. So opening a door just to check if it’s locked is maybe a simple goal for this. Perhaps more difficult is holding it open for someone else to walk through. And then even harder might be holding the door open while you carry something fragile or spillable as the robot walks through. So it constructs its learning. It’s iteratively doing more complicated doors, and for more complicated goals, until it has generally mastered the skill of opening doors.
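
The door example can be sketched as a small self-guided practice loop. This is a minimal illustration under stated assumptions, not the NRL team's method: the numeric difficulty levels, the additive difficulty score, the `attempt` callback and the easiest-first selection rule are all hypothetical stand-ins for whatever machine learning and goal reasoning the project ends up using.

```python
from itertools import product
from typing import Callable, List, Set, Tuple

Task = Tuple[str, str]  # (door type, goal)

# Assumed difficulty levels, loosely following the interview's ordering.
DOORS = {"swinging door": 1, "push bar": 2, "lever handle": 3}
GOALS = {
    "check if locked": 1,
    "hold open for someone": 2,
    "carry something fragile through": 3,
}


def difficulty(task: Task) -> int:
    """Assumed score: door difficulty plus goal difficulty."""
    door, goal = task
    return DOORS[door] + GOALS[goal]


def build_agenda(mastered: Set[Task]) -> List[Task]:
    """Unmastered door/goal combinations, easiest first."""
    remaining = [t for t in product(DOORS, GOALS) if t not in mastered]
    return sorted(remaining, key=difficulty)


def practice(attempt: Callable[[Task], bool], max_rounds: int = 100) -> List[Task]:
    """Rebuild the agenda after every attempt and try the easiest open task."""
    mastered: Set[Task] = set()
    history: List[Task] = []
    for _ in range(max_rounds):
        agenda = build_agenda(mastered)
        if not agenda:
            break  # every door/goal combination is mastered
        task = agenda[0]
        history.append(task)
        if attempt(task):  # hypothetical hook for a real or simulated trial
            mastered.add(task)
    return history


if __name__ == "__main__":
    # Stand-in for a robot trial: pretend every attempt succeeds.
    for door, goal in practice(lambda task: True):
        print(f"{door}: {goal}")
```

Because the agenda is rebuilt after each attempt, a failed trial simply leaves the task at the front of the list for another try, and doors and goals end up interleaved by combined difficulty rather than being worked through one door type at a time.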

Tom Temin: And what about the idea of holding up a CAC card to unlock the door and then knowing it’s now unlocked, and now you can push the crash bar?

Laura Hiatt: Yeah, that might be the gold standard there. We’ll have to see. So far robots don’t have CACs, but.

Tom Temin: Well, this is something that you developed in cooperation with the Naval Research Laboratory, the NRL, do you have funding for this and what was the source of that?

Mark Roberts: So this is a three-year project. It’s funded by the Basic Research Office as part of the LUCI program. LUCI stands for the Laboratory University Collaborative Initiative. And the purpose of that program is to pair national labs with researchers at top-tier universities. So in this project, we’ll be collaborating with two professors who are leaders in artificial intelligence research. Professor Dana Nau at the University of Maryland is an expert in automated planning, who has co-written two textbooks on the subject and led the effort to develop the kind of planning technology that Laura was talking about a moment ago, that we will use in this project. And we will be working with Professor George Konidaris from Brown University, who’s an expert in reinforcement learning, and has worked on developing techniques for learning abstractions that facilitate faster learning for actions and for perception.

Tom Temin: And have you constructed a type of laboratory with lots of doors that the robot can successively march through with a cam?

Mark Roberts: Not yet. We will be doing that as part of the project. It should start, we hope, in December. But yeah, that’ll be part of the project. There are some physical places at the NRL where we already have some researchers doing pieces that are similar to that with other kinds of doors or hatches, so hopefully we’ll be able to leverage the work that they’ve done.

Laura Hiatt: Yeah. And it turns out even in our offices, there are plenty of types of doors to get started with.

Tom Temin: Yeah, that’s right. And maybe you can learn to jump through hoops after that, you know, and then cut red tape, and you’d really have a good government robot. But let me ask you this. Are there implications for future applications in door opening? Or is door opening simply the type of problem that can really test the theory of self learning? And then you could apply it to other activities? Or maybe a little of both?

Laura Hiatt: Yeah, it’s the second one. So right now we’re interested in just the general ability of using a learning agenda, and door opening is what we’re focusing on to kind of really figure out how that would look and what it would end up being. Of course, having a robot that can learn to open all kinds of doors is also a capability that many robots would benefit from.

Tom Temin: All right, so this research really is opening doors on many levels.

Laura Hiatt: Yes, you could say that.

Tom Temin: All right. Well, I’d like to come see that in operation one of these days. Laura Hiatt and Mark Roberts are scientists at the U.S. Naval Research Laboratory. Thanks so much for joining me and good luck.

Laura Hiatt: Thank you. We’re happy to be here.

Mark Roberts: And thanks for having us here. We appreciate your interest in the project. And we want to just thank our sponsors, the Basic Research Office, which is funding this, as well as other sponsors whose support led to research that led to this project, namely the Air Force Office of Scientific Research and the Office of Naval Research.