The key to the separation you're looking for is keeping the microphone(s) close to the person speaking. Shotgun mics (even expensive ones) may not give you the level of rejection you need. This article will give you some insight.
Remember that sound follows the inverse square law: other things being equal, doubling (or halving) the distance gives a 6 dB difference in level. With omnidirectional lavalier mics, if one person's mic is 4 times as far from the other speaker's mouth as that speaker's own mic is, the other speaker should come through about 12 dB lower.
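If you want to check the numbers for your own setup, here is a minimal sketch of that rule of thumb. It assumes an idealized point source in a free field (no room reflections) and omnidirectional mics, so real rooms will usually give you less separation than it predicts. The function name and the example distances are just illustrative.

```python
import math

def level_difference_db(near_distance, far_distance):
    """Approximate level drop in dB between a near mic and a far mic.

    Assumes inverse-square falloff (free field, point source), which
    works out to 20 * log10 of the distance ratio. Distances can be in
    any unit as long as both use the same one.
    """
    if near_distance <= 0 or far_distance <= 0:
        raise ValueError("distances must be positive")
    return 20.0 * math.log10(far_distance / near_distance)

# Speaker's own lavalier at 0.25 m, the other person's lavalier at 1.0 m
print(round(level_difference_db(0.25, 1.0), 1))  # 4x the distance
print(round(level_difference_db(0.25, 0.5), 1))  # 2x the distance
```

A headworn mic a few centimetres from the mouth makes the distance ratio much larger, which is where the extra separation comes from.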
That said, in my experience speech recognition software is designed to be not very sensitive to level differences. This may turn into a problem for you, and you may need a backup plan to get more separation, such as a headworn mic. That way the mic stays very close to the person's mouth and gains a lot of advantage in separation.
Hope this helps!