AI Model Testing Framework

Created a doc to help engineers assess LLMs and troubleshoot test results.