What you are suggesting is similar to creating photo mosaics from aerial photography taken for photogrammetric mapping, where an aircraft flies in a straight line and takes photographs at an interval that gives a 60% overlap between consecutive photographs (so there is a 20% overlap between every other photograph).
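To make the overlap arithmetic concrete, here is a minimal sketch. It assumes equally spaced exposures along a straight line and frames of equal length; the function name `overlap_fraction` is only an illustration, not something from a photogrammetry package.

```python
def overlap_fraction(forward_overlap: float, k: int = 1) -> float:
    """Fraction of a frame shared by two photos taken k exposures apart.

    forward_overlap is the overlap between consecutive photos (0.60 for 60%).
    """
    # Distance advanced between exposures, as a fraction of the frame length.
    step = 1.0 - forward_overlap
    # After k steps the frames have slid apart by k * step; no overlap below zero.
    return max(0.0, 1.0 - k * step)


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, round(overlap_fraction(0.60, k), 2))
```

With a 60% forward overlap each exposure advances by 40% of a frame, so photographs two exposures apart still share about 20% of a frame and photographs three apart share nothing.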
The photo mosaic was created by "fitting" common points in each overlap.
However, this technique relies on the assumption that the ground is relatively flat (i.e. changes in elevation are small compared with the flying height), so that the perspective in each photograph can be ignored. I feel that this is where attempting it in the way you suggest will unfortunately fail.
To give an idea of what I am trying to describe:
Say the subject was two parallel lines of poles parallel to your camera positions.
If you "fitted" the photographs using the line of poles nearest your camera positions, then each pole in the farther line will appear both to the left and to the right of the near poles, because it shifts sideways relative to them from one photograph to the next. Somehow you would need to remove the perspective from each image to create an orthographic image (an image in which every point of the subject is projected at right angles onto the image plane, rather than along rays radiating from the camera).
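The pole example can be checked with a simple pinhole-camera sketch. Everything here is an assumption made for illustration: the camera looks straight at the poles, the focal length is 1, the near row is 5 units away and the far row 10 units away, and the function names are made up.

```python
def image_x(pole_x: float, depth: float, camera_x: float, focal_length: float = 1.0) -> float:
    """Horizontal image coordinate of a pole under an ideal pinhole projection."""
    return focal_length * (pole_x - camera_x) / depth


def far_minus_near(camera_x: float, pole_x: float = 0.0,
                   near: float = 5.0, far: float = 10.0) -> float:
    """Image offset of the far pole relative to the near pole directly in front of it."""
    return image_x(pole_x, far, camera_x) - image_x(pole_x, near, camera_x)


if __name__ == "__main__":
    # Move the camera along a line parallel to the two rows of poles.
    for camera_x in (-1.0, 0.0, 1.0):
        print(camera_x, far_minus_near(camera_x))
```

The offset changes sign as the camera passes the poles, so aligning the near poles between two photographs leaves the far pole on opposite sides of them. In an orthographic image the coordinate would simply be `pole_x` whatever the depth, and the offset would always be zero.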
To get the solution to work, you would need either to find a way of converting the perspective image created by the camera into an orthographic image, or to choose a subject with very little difference in distance from the camera (e.g. a row of shop fronts) and ignore the perspective.
You would then need to "fit" the images manually or use a package designed for "fitting" multiple scans together (merge documents), such as PanoramaStudio (http://www.tshsoft.de/en/index.html).
I'll have a look at those other apps and see how I get on.
I hadn't considered aerial mosaics and I guess in the scheme of things, these are going to be even more complex to accurately join. You take things like Google Maps et al for granted so thanks for bringing that subject up.