OBS Group Inc. - Notice history

OBS Serenity Workspace - Operational

100% - uptime
Oct 2024 · 100.0% | Nov 2024 · 100.0% | Dec 2024 · 100.0%

OBS ONE Alpha - Operational

100% - uptime
Oct 2024 · 100.0% | Nov 2024 · 100.0% | Dec 2024 · 100.0%

OBS Official Webspace - Operational

100% - uptime
Oct 2024 · 100.0% | Nov 2024 · 100.0% | Dec 2024 · 99.89%

OBS HSZ A HYD - Operational

100% - uptime
Oct 2024 · 100.0% | Nov 2024 · 100.0% | Dec 2024 · 100.0%

OBS HSZ B BLR - Operational

100% - uptime
Oct 2024 · 100.0% | Nov 2024 · 100.0% | Dec 2024 · 100.0%

OBS HSZ C BOM - Operational

100% - uptime
Oct 2024 · 100.0% | Nov 2024 · 100.0% | Dec 2024 · 100.0%

OBS HSZ D LON - Operational

100% - uptime
Oct 2024 · 100.0% | Nov 2024 · 99.51% | Dec 2024 · 100.0%

OBS HSZ E NYC - Operational

100% - uptime
Oct 2024 · 100.0% | Nov 2024 · 100.0% | Dec 2024 · 100.0%

OBS HSZ F DEL/GN - Operational

100% - uptime
Oct 2024 · 100.0% | Nov 2024 · 100.0% | Dec 2024 · 100.0%

OBS HSZ EU LUXEMBOURG - Operational

100% - uptime
Oct 2024 · 100.0% | Nov 2024 · 100.0% | Dec 2024 · 100.0%

OBS TestX - Operational

100% - uptime
Oct 2024 · 100.0% | Nov 2024 · 100.0% | Dec 2024 · 100.0%

OBS Dream Centre - Operational

100% - uptime
Oct 2024 · 100.0% | Nov 2024 · 100.0% | Dec 2024 · 100.0%

OBS Pay - Operational

100% - uptime
Oct 2024 · 100.0% | Nov 2024 · 100.0% | Dec 2024 · 100.0%

OBS Pay APIs - Operational

100% - uptime
Oct 2024 · 100.0% | Nov 2024 · 100.0% | Dec 2024 · 100.0%

OBS MIRD Research Cloud - Operational

100% - uptime
Oct 2024 · 100.0% | Nov 2024 · 99.85% | Dec 2024 · 100.0%

OBS Dreamer ID Services - Operational

100% - uptime
Oct 2024 · 100.0% | Nov 2024 · 99.51% | Dec 2024 · 100.0%

OBS MIRD Research Cloud APIs - Operational

100% - uptime
Oct 2024 · 100.0% | Nov 2024 · 99.85% | Dec 2024 · 100.0%

OBS Global CDN - Operational

100% - uptime
Oct 2024 · 100.0% | Nov 2024 · 100.0% | Dec 2024 · 100.0%

OBS MediaNeXT - Operational

100% - uptime
Oct 2024 · 100.0% | Nov 2024 · 100.0% | Dec 2024 · 100.0%

OBS Press Self-Publish APIs - Operational

100% - uptime
Oct 2024 · 100.0% | Nov 2024 · 100.0% | Dec 2024 · 100.0%

OBS Anugraha Alpha - Operational

100% - uptime
Oct 2024 · 100.0% | Nov 2024 · 100.0% | Dec 2024 · 100.0%

OBS Anugraha Service APIs - Operational

100% - uptime
Oct 2024 · 100.0% | Nov 2024 · 100.0% | Dec 2024 · 100.0%

OBS SecuRA ZTE - Operational

100% - uptime
Oct 2024 · 100.0% | Nov 2024 · 100.0% | Dec 2024 · 100.0%

OBS SecuRA Runtime Environment - Operational

100% - uptime
Oct 2024 · 100.0% | Nov 2024 · 100.0% | Dec 2024 · 100.0%

OBS SecuRA Encryption APIs - Operational

100% - uptime
Oct 2024 · 100.0% | Nov 2024 · 100.0% | Dec 2024 · 100.0%

OBS Security Motion APIs - Operational

100% - uptime
Oct 2024 · 100.0% | Nov 2024 · 100.0% | Dec 2024 · 100.0%
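
The monthly percentages above can be turned into approximate downtime. The sketch below assumes uptime is measured over every minute of the calendar month; the page does not state how the figures are computed, so the results are estimates only. The function name `downtime_minutes` is hypothetical.

```python
def downtime_minutes(uptime_percent: float, days_in_month: int) -> float:
    """Estimate downtime in minutes from a monthly uptime percentage.

    Assumption: uptime is averaged over every minute of the calendar month.
    """
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (100.0 - uptime_percent) / 100.0


# Figures taken from the service list above.
nov_zone_d = downtime_minutes(99.51, 30)   # OBS HSZ D LON, Nov 2024
nov_mird = downtime_minutes(99.85, 30)     # OBS MIRD Research Cloud, Nov 2024
dec_webspace = downtime_minutes(99.89, 31) # OBS Official Webspace, Dec 2024

print(f"HSZ D LON, Nov 2024: about {nov_zone_d:.0f} minutes")
print(f"MIRD Research Cloud, Nov 2024: about {nov_mird:.0f} minutes")
print(f"Official Webspace, Dec 2024: about {dec_webspace:.0f} minutes")
```

Under that assumption, 99.51% over a 30-day month corresponds to roughly three and a half hours of downtime, and 99.85% to roughly one hour.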

Notice history

Dec 2024

Nov 2024

Global Service Disruption: Partial Outages in OBS MIRD Research Cloud & APIs, Major Outages in OBS HyperScalar Zone D and Dreamer ID Services
  • Resolved

    OBS Group Inc. experienced a global service disruption affecting multiple key platforms due to misconfigured cloud scaling resources. The incident impacted the following services:

    1. OBS MIRD Research Cloud & APIs (Partial outage across Europe, US, and India)

    2. OBS HyperScalar Zone D (Major global outage in London data center)

    3. OBS Dreamer ID Services (Global outage of authentication systems)

    The issues have been successfully resolved, and all services have been restored to normal operation.


    Resolution Actions Taken

    Immediate Remediation

    1. Scaling Configuration Adjustment

      • Corrected misconfigured auto-scaling parameters across the affected platforms to restore proper resource allocation.

    2. Service Restarts

      • Restarted critical components in OBS HyperScalar Zone D and Dreamer ID authentication services to resume functionality.

    3. Performance Validation

      • Conducted comprehensive tests to ensure all services were operating as expected without residual issues.

    Monitoring and Stabilization

    • Deployed enhanced monitoring tools to track resource usage and performance metrics in real-time.

    • Implemented temporary safeguards to prevent similar scaling misconfigurations while a long-term solution is developed.


    Root Cause Summary

    The incident was caused by a misconfiguration in cloud scaling policies, which:

    1. Limited the system’s ability to allocate additional resources during peak demand.

    2. Propagated resource shortages across dependent systems, leading to widespread outages and degraded performance.
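
    The effect of a scaling ceiling that is set too low can be illustrated with a minimal sketch. All names and numbers below (`allocated_instances`, `rps_per_instance`, the demand figures) are hypothetical and are not taken from OBS systems; the sketch only shows how a capped maximum leaves part of the demand unserved.

```python
import math


def allocated_instances(demand_rps, rps_per_instance, min_instances, max_instances):
    """Return (desired, allocated) instance counts for a given demand.

    'desired' is what the demand calls for; 'allocated' is what the
    scaling policy permits once the max_instances ceiling is applied.
    """
    desired = max(min_instances, math.ceil(demand_rps / rps_per_instance))
    allocated = min(desired, max_instances)
    return desired, allocated


def shortfall_fraction(demand_rps, rps_per_instance, min_instances, max_instances):
    """Fraction of demand that cannot be served under the policy limits."""
    if demand_rps <= 0:
        return 0.0
    _, allocated = allocated_instances(
        demand_rps, rps_per_instance, min_instances, max_instances
    )
    capacity = allocated * rps_per_instance
    return max(0.0, 1.0 - capacity / demand_rps)


# Hypothetical peak: 10,000 requests/s, 500 requests/s per instance.
print(allocated_instances(10_000, 500, 2, 8))   # ceiling of 8 instances
print(shortfall_fraction(10_000, 500, 2, 8))    # share of demand left unserved
print(shortfall_fraction(10_000, 500, 2, 40))   # ceiling raised to 40
```

    With the ceiling at 8 instances, the policy can serve only part of the hypothetical peak; raising the ceiling above the desired count removes the shortfall.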


    Impact Overview

    • OBS MIRD Research Cloud

      • Impact: Partial outages and performance degradation across Europe, US, and India.

      • Resolution: Restored resource allocation.

    • OBS MIRD APIs

      • Impact: Latency and partial failures.

      • Resolution: Corrected scaling and validated APIs.

    • HyperScalar Zone D

      • Impact: Complete outage in London Zone D, affecting hosted workloads.

      • Resolution: Restarted services after scaling fix.

    • Dreamer ID Services

      • Impact: Global authentication failure, locking users out.

      • Resolution: Resolved configuration; resumed access.


    Conclusion

    OBS Group Inc. has fully resolved the service disruptions caused by the misconfigured cloud scaling resources. The organization has implemented immediate fixes and initiated long-term improvements to ensure service reliability and prevent similar issues in the future.

    OBS Group Inc. apologizes for the inconvenience caused and appreciates the patience and understanding of our customers during this incident.


  • Identified

    Root Cause Analysis

    Cause

    • A misconfiguration in the cloud auto-scaling settings inadvertently limited resource allocation, preventing the system from scaling to meet demand during peak usage.

    • The issue propagated through dependent systems, leading to widespread disruptions.

    Key Contributing Factors

    1. MIRD Research Cloud and APIs

      • Insufficient compute and storage scaling in the affected regions (Europe, US, and India) caused performance degradation and partial outages.

    2. HyperScalar Zone D (London)

      • The scaling misconfiguration resulted in resource starvation, triggering a complete outage in Zone D.

    3. Dreamer ID Services

      • Automatic log-outs from authenticated devices produced high authentication traffic, which overwhelmed the misconfigured system and led to a total global outage.


    Impact Assessment

    Business Impact

    • Interrupted access to OBS services for research institutions, enterprises, and global customers.

    • Affected user workflows, leading to potential financial losses for clients reliant on hosted services.

    • Damage to OBS Group Inc.'s reputation and customer trust.

    User Impact

    • MIRD Research Cloud & APIs: Delayed or failed operations in data-intensive tasks across Europe, US, and India.

    • HyperScalar Zone D: Complete downtime for workloads hosted in Zone D, affecting global operations.

    • Dreamer ID Services: Inability to authenticate, locking users out of multiple OBS services worldwide.


    Resolution Steps

    Immediate Actions

    1. Identified the misconfigured auto-scaling parameters in the affected services.

    2. Adjusted scaling thresholds to allow for proper allocation of resources during high demand.

    3. Restarted critical services in OBS HyperScalar Zone D and Dreamer ID authentication systems.

    4. Conducted validation tests to ensure stability and restored access.

    Long-Term Mitigation Measures

    1. Implement automated monitoring and alerts for misconfigured scaling policies.

    2. Conduct a comprehensive review of all cloud scaling configurations across OBS services.

    3. Enhance load-testing protocols to simulate peak demand scenarios.

    4. Provide additional training for engineering teams on cloud scaling best practices.
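
    As an illustration of the first mitigation measure, an automated check for misconfigured scaling policies might look like the sketch below. The `ScalingPolicy` fields, the `find_misconfigured` function, the service names, and the 20% headroom factor are all assumptions made for this example, not a description of OBS tooling.

```python
import math
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    # Hypothetical policy shape; real policies will have more fields.
    service: str
    min_instances: int
    max_instances: int
    rps_per_instance: float


def find_misconfigured(policies, peak_rps, headroom=1.2):
    """Return (service, reason) pairs for policies that cannot absorb peak demand.

    peak_rps maps service name -> observed peak requests per second.
    headroom is a safety multiplier applied to the observed peak.
    """
    findings = []
    for p in policies:
        if p.min_instances > p.max_instances:
            findings.append((p.service, "min_instances exceeds max_instances"))
            continue
        if p.service not in peak_rps:
            continue  # no demand data, nothing to compare against
        needed = math.ceil(peak_rps[p.service] * headroom / p.rps_per_instance)
        if p.max_instances < needed:
            findings.append(
                (p.service, f"max_instances={p.max_instances} below required {needed}")
            )
    return findings


policies = [
    ScalingPolicy("auth-service", 2, 8, 500.0),
    ScalingPolicy("research-cloud", 4, 40, 500.0),
]
peaks = {"auth-service": 10_000, "research-cloud": 10_000}

for service, reason in find_misconfigured(policies, peaks):
    print(f"ALERT {service}: {reason}")
```

    A check of this kind could run whenever a policy changes, so that a ceiling below expected peak demand raises an alert before it causes resource starvation.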

  • Investigating

    A series of service disruptions and outages occurred across several OBS Group Inc. platforms, impacting users globally. The incidents are currently under investigation, and updates will be provided as they become available. Below is a summary of the affected services and regions:

    1. OBS MIRD Research Cloud

      • Impact: Partial outage.

      • Affected Regions: Europe, the United States, and India.

      • Description: Users in the mentioned regions experienced degraded performance and intermittent access issues.

    2. OBS MIRD Research Cloud APIs

      • Impact: Partial outage.

      • Affected Regions: Europe, the United States, and India.

      • Description: APIs associated with MIRD Research Cloud are reporting latency issues and partial failures, impacting integrations with external systems.

    3. OBS HyperScalar Zone D (London)

      • Impact: Major global outage.

      • Affected Region: Zone D (London).

      • Description: All services in this zone are offline, resulting in significant disruptions for hosted workloads and applications.

    4. OBS Dreamer ID Services

      • Impact: Major global outage.

      • Affected Region: Global.

      • Description: Dreamer ID authentication services are entirely unavailable, causing login and access issues for users worldwide.


    Initial Investigation

    1. Observations

      • Service degradation in OBS MIRD Research Cloud and APIs began at 10 PM -13:30UTC.

      • The global outage of OBS HyperScalar Zone D (London) and Dreamer ID Services was detected shortly after, at approximately 2 AM -13:30UTC.

    2. Potential Causes

      • Network connectivity issues across multiple regions.

      • Possible hardware failure or power issues at the London data center (Zone D).

      • A systemic failure in Dreamer ID's authentication infrastructure.

      • Dependencies between services may have propagated the impact.

    3. Current Status

      • Teams are actively investigating root causes for each impacted service.

      • Mitigation steps are being planned and implemented for services where partial functionality can be restored.


    Impact Assessment

    • Business Impact

      • Disrupted access to key research and cloud services, affecting academic and enterprise users.

      • Global inability to log into services using Dreamer ID.

      • Critical operations hosted in Zone D are offline, leading to delays and potential financial loss for clients.

    • User Impact

      • Limited or no access to cloud-based resources.

      • Interruptions in data processing and collaboration workflows.

      • Complete loss of authentication functionality, preventing access to all dependent services.


    Next Steps

    1. Ongoing Investigation

      • Incident response teams are analyzing logs and conducting a root cause analysis.

      • Collaboration with regional data centers and network providers is underway.

    2. Service Restoration

      • Teams are prioritizing partial recovery for the MIRD Research Cloud and APIs.

      • HyperScalar Zone D is under review for potential hardware fixes or system restarts.

      • Engineering teams are working to restore Dreamer ID Services globally.

    3. Communications

      • Regular updates will be shared with affected users and stakeholders.

      • An incident retrospective will be conducted after service restoration to identify and implement preventative measures.


    Contacts

    • Incident Manager: James Dupont

    • Technical Lead: Sarthak Videet

    • Customer Support: support@obsgroup.tech


    Final Note

    OBS Group Inc. is committed to resolving these issues as quickly as possible. We apologize for the inconvenience caused and appreciate your patience during this time.

Oct 2024 to Dec 2024
