Some initial thoughts for you...
First, decide whether you want to report availability in terms of uptime or downtime, as this will determine which calculation method suits you best.
You essentially have three options for availability reporting:
The first option (DOWNTIME) requires that you determine your resilience capabilities, e.g. whether you offer single server, active:passive, active:active or multiple active configurations, since each of these drives a different availability capability. You would need to do this for each type of component. This is probably the preferred method, but it takes time to set up and also requires some strategic decisions about which components are on the critical path to deliver service. The theory is that, for example, where a service uses an active:active configuration, its capability is actually closer to 99.75% than to the 99.5% OLA, and so on. You will also need to consider how capability differs between virtual and physical delivery. Once you have completed your modelling, you can continue to use your current calculation method with whatever monitoring processes you use today.
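To make the resilience modelling a little more concrete, here is a minimal sketch of the textbook redundancy calculation. It assumes node failures are completely independent and that any one surviving node can carry the full service, which real deployments rarely achieve (shared power, network, software faults and failover time all eat into it). Treat the result as a theoretical ceiling; agreed capability figures such as the 99.75% above are normally set more conservatively. The function name and the node counts are purely illustrative.

```python
def parallel_availability(component_availability: float, active_nodes: int) -> float:
    """Theoretical availability of N redundant nodes, any one of which can run the service.

    Assumes independent failures and instant failover (both are simplifications).
    """
    unavailability = 1.0 - component_availability
    # The service is down only when every node is down at the same time.
    return 1.0 - unavailability ** active_nodes


single_node = 0.995  # the 99.5% figure used in the example above

for label, nodes in [("single server", 1), ("active:active", 2), ("multiple active (3 nodes)", 3)]:
    print(f"{label}: {parallel_availability(single_node, nodes):.6%}")
```

Active:passive is deliberately left out, because its capability depends mostly on how long failover takes, and that has to come from your own measurements.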
The second option (UPTIME) still assesses according to resilience but starts from the top down. You would need to determine your patterns of capability. These would turn into infrastructure, database, application and web service patterns that could be utilised by any service. Your availability reporting would then be a case of monitoring against pattern capability, and all IT services would need to be designed using these patterns. For example, you may have a service that requires top-notch infrastructure capability, standard database, standard application and top-notch web services; the capability associated with these may be 99.999% × 99% × 99% × 99.99%, which multiplies through to roughly 98% end to end. In practice you would design in monitoring that documents the MTBF and uptime capability of the service. This option is much easier to automate, but it has the constraint that it does not take into account customer perception of service.
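The pattern arithmetic can be sketched in a few lines. This assumes the four layers sit in series (the service needs all of them) and fail independently; the layer names and figures are the ones from the example above, and the dictionary layout is just one illustrative way to hold them.

```python
from math import prod


def series_availability(layer_availabilities) -> float:
    """End-to-end availability when every layer is required (layers in series)."""
    return prod(layer_availabilities)


# Capability of each pattern, taken from the example in the text.
pattern = {
    "infrastructure": 0.99999,  # top-notch
    "database": 0.99,           # standard
    "application": 0.99,        # standard
    "web services": 0.9999,     # top-notch
}

end_to_end = series_availability(pattern.values())
hours_per_year = 365 * 24

print(f"End-to-end capability: {end_to_end:.3%}")
print(f"Implied downtime: about {(1 - end_to_end) * hours_per_year:.0f} hours per year")
```

The point it illustrates is that the weakest patterns dominate: the two standard 99% layers account for almost all of the implied downtime, however good the other two are.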
The final option (DOWNTIME) is purely to use your Incident Management Logging Tool, ensuring that all Severity 1 & 2 Incidents have associated downtime and cause statements. This is a much cruder method of calculation, but it can take into account customer perception of service (i.e. you can record both the actual downtime and the time from when the Incident was logged until the Customer approved it for closure).
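As a rough sketch of the incident-based calculation, the snippet below derives both an "actual" and a "customer-perceived" availability figure from incident records. The record fields, the sample incidents and the reporting period are all made up for illustration and do not correspond to any particular tool's export format. It also assumes incidents do not overlap, fall entirely within the reporting period, and that downtime starts when the Incident is logged.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    severity: int
    logged_at: datetime
    service_restored_at: datetime   # end of actual downtime
    closure_approved_at: datetime   # customer agrees the Incident can be closed


def availability(incidents, period_start, period_end, perceived=False) -> float:
    """Availability over the period, counting only Severity 1 and 2 Incidents."""
    downtime = timedelta()
    for inc in incidents:
        if inc.severity not in (1, 2):
            continue
        end = inc.closure_approved_at if perceived else inc.service_restored_at
        downtime += end - inc.logged_at
    return 1.0 - downtime / (period_end - period_start)


incidents = [
    Incident(1, datetime(2024, 4, 10, 10, 0), datetime(2024, 4, 10, 12, 0), datetime(2024, 4, 10, 13, 0)),
    Incident(3, datetime(2024, 4, 15, 9, 0), datetime(2024, 4, 15, 9, 30), datetime(2024, 4, 15, 10, 0)),
]
start, end = datetime(2024, 4, 1), datetime(2024, 5, 1)  # 30-day reporting period

actual = availability(incidents, start, end)
customer_view = availability(incidents, start, end, perceived=True)
print(f"Actual: {actual:.3%}  Customer-perceived: {customer_view:.3%}")
```

The gap between the two figures is the useful part: it shows how much of the reported unavailability comes from the time taken to confirm closure with the Customer, as opposed to the outage itself.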
Availability is a complex area and how you progress will depend almost entirely on your strategic vision for defining and reporting against service. I wish you success!!