IBM Bluemix has a cloud offering that does image recognition. I played around with it and it is really fantastic, but I don't know if it offers the level of customization you would need. From what I saw, it seems to match images against their own object database. Still, if you have a few spare moments, it is worth taking a look at. It is kind of like Google Goggles.
I also recall seeing some API offerings at mashape.com that did image recognition. Once again, I don't know if it will allow you to add your own customization to detect person A and person B.
Quite possibly the above posts were a waste of your time.
Then there is the French system SARAH, which will do this, so I know it can be done. It does require a Kinect and proper training (pretty cool how it works), and it heavily utilizes system resources on the PC running it.
Finally, and this may be your most promising prospect, OpenCV's face recognition tutorial: http://docs.opencv.org/modules/contrib/doc/facerec/facerec_tutorial.html
I agree with you that it would be great to have the ability for your house to know who is in it and where. Even better to know when someone unknown is in it and react. But I am willing to bet the implementation of such a system will require either $$$ or heavy system resource utilization. Running a Kinect with full recognition turned on ramps the CPU way up.
I think it would be really cool to have the home system know my wife is downstairs in the living room, and I'm upstairs in the bedroom. I could say 'Ask Jill if she wants to go out for dinner' and the downstairs system would output my message to her, she could respond, and my system would announce back to me. All because, one way or another, the system is able to determine our location (I realize there are various ways this could be accomplished).
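To make the relay idea a bit more concrete, here is a rough Python sketch of just the routing part. It assumes something else (face recognition, a Kinect, whatever you end up with) is already reporting who is in which room. The HomeHub class, its method names, and the room names are all made up for illustration and do not come from any of the products mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class HomeHub:
    """Hypothetical hub that routes spoken messages by last known location."""

    # person -> room, filled in by whatever presence detection is in use
    locations: dict = field(default_factory=dict)
    # room -> messages announced there (a stand-in for a real speaker)
    spoken: dict = field(default_factory=dict)

    def update_location(self, person, room):
        # Called whenever the recognition side sees someone in a room.
        self.locations[person] = room

    def announce(self, room, text):
        # A real system would do text-to-speech here; this only records it.
        self.spoken.setdefault(room, []).append(text)

    def relay(self, sender, recipient, text):
        # Deliver the message to the room the recipient was last seen in.
        room = self.locations.get(recipient)
        if room is None:
            # Recipient not located: report back to the sender's room if known.
            sender_room = self.locations.get(sender)
            if sender_room is not None:
                self.announce(sender_room, f"I can't find {recipient} right now.")
            return None
        self.announce(room, f"{sender} asks: {text}")
        return room


hub = HomeHub()
hub.update_location("Jill", "living room")
hub.update_location("Me", "bedroom")

# "Ask Jill if she wants to go out for dinner"
delivered_to = hub.relay("Me", "Jill", "do you want to go out for dinner?")
# Her reply goes back the same way.
reply_to = hub.relay("Jill", "Me", "yes, give me ten minutes")
# Someone the system has not located gets a fallback message instead.
missing = hub.relay("Me", "Bob", "are you home?")
```

The routing is the easy part. The hard (and CPU-hungry) part is whatever keeps the locations up to date.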