The use of large language models (LLMs) can promote bias in various downstream tasks with respect to specific issues and social groups. Addressing bias in LLMs is an ongoing effort, with research focusing on understanding how bias is promoted, on bias evaluation metrics, benchmark datasets, and mitigation techniques. To gain a deeper understanding of the anatomy of bias in LLMs, we present a streamlined strategy consisting of polarisation-based evaluation metrics paired with a synthetic dataset of events, measuring bias in LLMs via sentiment scores. As a case study, we apply state-of-the-art LLMs (Llama-3, Mistral, GPT-4, Claude-3.5, Gemini-1.0) to the perception of sentiment of events in the context of the Russo-Ukrainian conflict. The dataset was constructed against the background of this conflict, and we used the CAMEO conflict notation, structured into 15 categories, to generate events symmetrically toward the two conflicting parties. We evaluated and visualised the bias between models, between categories, and between the target groups. Our experiments confirmed the main hypothesis that LLMs perceive RU targets less positively than UA targets. However, some categories showed anomalous behaviour, and there were also significant differences in bias between models.
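To make the polarisation-based comparison concrete, the following is a minimal, hypothetical sketch (not the paper's exact metric or data): it assumes per-event sentiment scores in [-1, 1] for mirrored RU- and UA-targeted event pairs in each CAMEO-style category, and reports the mean sentiment gap per category and overall. All names and values are illustrative placeholders.

```python
# Hypothetical sketch: compute a sentiment "polarisation" gap between target groups.
# scores[category] holds (sentiment_toward_RU_target, sentiment_toward_UA_target)
# for symmetric event pairs; placeholder values, not results from the paper.
from statistics import mean

scores = {
    "01_make_public_statement": [(0.10, 0.35), (-0.05, 0.20)],
    "14_protest":               [(-0.40, -0.10), (-0.55, -0.25)],
}

def polarisation(pairs):
    """Mean signed gap (UA minus RU); positive values mean UA targets
    are perceived more positively than RU targets."""
    return mean(ua - ru for ru, ua in pairs)

per_category = {cat: polarisation(pairs) for cat, pairs in scores.items()}
overall = mean(per_category.values())

for cat, gap in per_category.items():
    print(f"{cat}: {gap:+.2f}")
print(f"overall polarisation: {overall:+.2f}")
```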