Through the power of deep and machine learning, faster CPUs, and new types of sensors, computers can now see, hear, feel, smell, taste, and speak. All these senses take the form of some kind of sensor (like a camera) and some kind of mathematical algorithm, usually a supervised machine learning algorithm and a model.
Here is what is available.
See: image and facial recognition
Recent research into image and facial recognition lets computers not only detect object presence but detect multiple instances of similar objects. Facebook and Google have really been leading the way here, with multiple open source releases. Facebook has stated that it has a goal of detecting things in video.
Source: New feed