About hiteshgondalia

Database Technology evangelist

How does Oracle OCI proactively protect customer workloads in the cloud from day one?

Current Challenge

Customer engineers responsible for provisioning cloud resources might not be aware of, or well trained in, the best security configuration for their cloud implementation. If security is not addressed in the starting phase, it becomes very difficult to address later during cloud go-live, and the approach often becomes reactive. A proactive approach that addresses security from the start has been missing from many cloud service providers.

Oracle is helping to shift more of the security responsibilities from the customer to the cloud provider. 

Oracle OCI Gen2 Cloud is built from the ground up with built-in, always-on security and a Zero Trust security model.

Oracle Security Zones

A service that helps ensure customers implement Oracle’s security best practices by enforcing them from the start and removing the chance of configuration drift or of someone violating them later. This brings clarity about what is needed to meet their security needs and removes guesswork from the implementation.

Security Zones let you be confident that your resources in Oracle Cloud Infrastructure, including Compute, Networking, Object Storage, and Database resources, comply with Oracle security principles.

Access the Security Zone in OCI

Security zone: An association between a compartment and a security zone recipe. Resource operations in a security zone are validated against all policies in the recipe.

Security zone recipe: A collection of security zone policies.

Security zone policy: A security requirement for resources in a security zone.

When you create and update resources in a security zone, OCI validates these operations against the list of policies defined in the security zone recipe.
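The relationship between a zone, its recipe, and its policies can be pictured with a small Python sketch. This is only an illustrative model, not how OCI implements the service: the class names, the policy names, and the operation fields (`public`, `kms_key`) are all assumptions made up for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


# A policy is modeled as a named predicate over a requested operation.
# The names used here are illustrative, not Oracle's real policy identifiers.
@dataclass(frozen=True)
class ZonePolicy:
    name: str
    allows: Callable[[Dict[str, object]], bool]


@dataclass
class SecurityZone:
    compartment: str
    recipe: List[ZonePolicy] = field(default_factory=list)

    def validate(self, operation: Dict[str, object]) -> List[str]:
        """Return the names of every policy in the recipe that the operation violates."""
        return [p.name for p in self.recipe if not p.allows(operation)]


recipe = [
    ZonePolicy("deny-public-bucket", lambda op: not op.get("public", False)),
    ZonePolicy("require-customer-managed-key", lambda op: op.get("kms_key") is not None),
]
zone = SecurityZone(compartment="prod-secure", recipe=recipe)

# The same checks run no matter which user submits the operation.
ok = zone.validate({"type": "bucket", "public": False, "kms_key": "key-1"})
bad = zone.validate({"type": "bucket", "public": True, "kms_key": None})
print(ok)   # []
print(bad)  # ['deny-public-bucket', 'require-customer-managed-key']
```

An empty result means the operation may proceed; any violation means it is denied, which mirrors the idea that zone policies only deny actions and never grant them.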

High Level Proposed Architecture

Creating Security Zone

Your tenancy has a predefined recipe named “Maximum Security Recipe”, which includes all available security zone policies. Oracle manages this recipe and you can’t modify it.

In general, security zone policies align with these security principles:

  1. Resources can’t be moved from a security zone to a standard compartment because it might be less secure.
  2. Data in a security zone can’t be copied to a standard compartment because it might be less secure.
  3. All the required components for a resource in a security zone must also be located in a security zone. Resources that are not in a security zone might be vulnerable. For example, a compute instance in a security zone can’t use a boot volume that is not in a security zone.
  4. Resources in a security zone must not be accessible from the public internet.
  5. Resources in a security zone must be encrypted using customer-managed keys.
  6. Resources in a security zone must be regularly and automatically backed up.
  7. Resources in a security zone must use only configurations and templates approved by Oracle.

A security zone policy differs from an IAM policy in the following ways:

  • Administrators create IAM policies to grant users the ability to manage certain resources in a compartment.
  • A security zone policy ensures that these management operations comply with the Oracle maximum security architecture and best practices.
  • A security zone policy is validated regardless of which user is performing the operation.
  • A security zone policy denies certain actions; it doesn’t grant capabilities.
  • Administrators can’t create, modify, or disable security zone policies.

Verify the Security Zone

  1. You can’t create a bucket without customer-managed keys. The console prompts you to follow the workflow for creating a secure bucket.
  2. You can’t create a public bucket in a security zone.
  3. You can’t move a bucket from a security zone to a standard compartment.
  4. You can’t add an internet gateway in a security zone.

Reference Architecture

OCI Documentation

Thank you for visiting this blog.

Disclaimer: The views expressed on this blog are my own and do not reflect the views of the companies I work for. The opinions given by visitors on this site are their own.

Keep your Cloud operation costs lower with Oracle Bastion Service

Organization Challenges

We all know that customer infrastructure should not be publicly accessible from the Internet. At the same time, any authenticated, regulated operator should be able to access that infrastructure.

Today, customers with target resources in the cloud are forced into certain access networking patterns because there is no native access service. Customers either keep their resources in a private subnet or launch a jump box in a public subnet.

Then they have to adjust security rules, routing rules, and so on. They also have to add the operators’ public SSH keys to the jump box so that operators can hop through it to the target resources.

The disadvantages of these access networking patterns are:

  1. The connections that are established are persistent, which weakens your security posture because the attack surface stays open for a long period of time.
  2. Customers carry the operational overhead of hardening these jump boxes, patching them periodically, and keeping them available so operators can reach mission-critical workloads.
  3. Such architectures work if you have a couple of resources here and there, but as your organization scales up they become very difficult to maintain, and there is always a risk of a security loophole. Security should be easy.
  4. These jump boxes run 24×7, so customers pay for running them around the clock.
  5. There is no auditability, so as a customer you don’t know who got into which target resource.
  6. Even with the customers’ best efforts, this architecture is not controlled through IAM. Any operator whose SSH key is on the jump box can access your target resources indefinitely, so managing the life cycle of who can access your target resources is very difficult.

Oracle Cloud Infrastructure Bastion Service

To solve these issues, Oracle created a fully managed service, the OCI Bastion service, which helps improve the security posture of your resources in OCI by providing secure, ephemeral access to your private target resources. The service is also free of charge. It is a core infrastructure security building block, so you don’t have to choose between cost and security.

Access to target resources through OCI Bastion is time-bound, which helps improve your overall security posture. Access is also governed by OCI IAM policies, so only users who have the right IAM policies can access your target resources. Once they leave the organization, all you have to do is remove those users from your IAM groups. You don’t have to do anything beyond that.

You can also restrict incoming SSH connections to certain IPv4 address ranges. Administrative actions, such as who created, deleted, updated, or fetched a bastion or session and when, are recorded in the OCI Events and Audit services and also in Cloud Guard.

End users on their on-premises laptops, desktops, or workstations can use any OpenSSH client and access the Bastion service as a pass-through to get into their target resources.

Use Cases

The OCI Bastion product is built on top of OpenSSH/SSH, so whatever is possible with OpenSSH is possible with this service.

The types of target resources supported by OCI Bastion include:

  • Private target compute hosts running native OCI images, custom Linux images, or Windows.
  • Autonomous Transaction Processing, Autonomous Data Warehouse, MySQL DB, and OKE (Kubernetes) instances.

You can manage the bastion and the sessions that are created through the service. This means that at any point in time, if you see that a session has turned malicious, or that you are under attack, you can simply delete the sessions and kick out the malicious users. You can also delete the whole bastion to protect your target resources.

Once the session is created, customers can use the session metadata to tunnel into the target resources via bastion from their on-premises terminals.

You can use OCI Bastion to access your private target resources in OCI regardless of whether the target resources have the Oracle Cloud Agent installed.

The session type depends on the target host.

Managed SSH sessions can only be created for a target host that is a Compute instance configured to run both the Oracle Cloud Agent and an OpenSSH server.

SSH port forwarding sessions do not require a running Oracle Cloud Agent or OpenSSH server on the target host, and can be used with resources like Autonomous Transaction Processing databases.
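The rule for picking a session type can be written down as a few lines of Python. This is a minimal sketch of the decision described above, not a call into the OCI SDK; the `Target` class, its fields, and the session type labels are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Target:
    name: str
    is_compute_instance: bool
    cloud_agent_running: bool
    openssh_running: bool


def session_types(target: Target) -> List[str]:
    """Return the session types that the rules above allow for this target."""
    # Port forwarding needs neither the Oracle Cloud Agent nor an OpenSSH server.
    types = ["port-forwarding"]
    # Managed SSH needs a Compute instance running both the agent and OpenSSH.
    if target.is_compute_instance and target.cloud_agent_running and target.openssh_running:
        types.insert(0, "managed-ssh")
    return types


vm = Target("app-vm", True, True, True)
adb = Target("atp-db", False, False, False)
print(session_types(vm))   # ['managed-ssh', 'port-forwarding']
print(session_types(adb))  # ['port-forwarding']
```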

High Level Architecture

HOW TO USE OCI BASTION?

Go to Identity & Security and choose Bastion

Create Compute Resource

Create Bastion Resource

Create Managed SSH Session

Create Session Port Forwarding

Access Windows Server using a Port Forwarding Session

Reference architecture

OCI Documentation


How to protect your application servers or containers in Cloud?

Most industries are subject to compliance rules and regulations from government or standards authorities, which vary from industry to industry, for maintaining security standards when delivering application services to end users. Security is therefore a crucial decision factor before an application goes live in most industries today.

I have seen customers validate security issues in their app server images and application packages or builds before going live in production. Traditionally, customers rely on security scanning tools such as Nessus, OpenVAS, OpenSCAP, Nmap, Wireshark, and Metasploit, but the real challenges start with large-scale deployments and the overhead of keeping those tools operating to meet the security SLA. In the cloud world, more and more of that responsibility also falls on the customer.

Vulnerability scanning is a common compliance requirement (e.g., NIST 800-53 Rev.4 FISMA) for customers and a recommended security best practice for all organizations.

Challenges

Customers face challenges with scanning due to:

1. Disjointed vulnerability scanning tools: customers often buy or license multiple tools for scanning instances, containers, and applications. The total cost can add up, leaving customers to choose between cost and security.

2. Lots of manual processes to correct vulnerabilities: customers must deploy, configure, and upgrade agents on their fleets, with large operational pain and the potential for misconfiguration due to human error.

3. Large volume of alerts with a high false positive rate: vulnerability reports can overwhelm customers with “noise”. Too many false positive findings cause customers to get lost in the volume or grow accustomed to it. As a result, the time to resolution for critical issues can increase or, even worse, critical issues can go unacknowledged.

Vulnerability Scanning Service

Oracle Cloud Infrastructure Vulnerability Scanning Service (OCI VSS) is simple, prescriptive, and tightly integrated with the OCI platform. VSS is available to all OCI customers that have paid accounts at no additional cost. The scanning platform includes default plugins and engines for instance and container scanning.

The Scanning service can identify several types of security issues in your compute instances:

  • Ports that are unintentionally left open, which might be a potential attack vector to your cloud resources or enable hackers to exploit other vulnerabilities
  • OS packages that require updates and patches to address vulnerabilities
  • OS configurations that hackers might exploit
  • Deviations from industry-standard benchmarks published by the Center for Internet Security (CIS)

The Scanning service checks hosts for compliance with the section 5 (Access, Authentication, and Authorization) benchmarks defined for Distribution Independent Linux.

The Scanning service can scan individual compute instances, or it can scan all compute instances within a compartment and its subcompartments. If you configure the Scanning service at the root compartment, then all compute instances in the entire tenancy are scanned.
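The way a compartment target fans out to its subcompartments can be pictured as a simple tree walk. The following Python sketch only illustrates that scoping rule; the compartment names, the instance names, and the two dictionaries are hypothetical, and the real service resolves targets internally.

```python
from typing import Dict, List

# Hypothetical tenancy layout: each compartment lists its child compartments
# and the compute instances that live directly inside it.
CHILDREN: Dict[str, List[str]] = {
    "root": ["prod", "dev"],
    "prod": ["prod-db"],
    "dev": [],
    "prod-db": [],
}
INSTANCES: Dict[str, List[str]] = {
    "root": [],
    "prod": ["web-1"],
    "dev": ["test-1"],
    "prod-db": ["db-1"],
}


def instances_in_scope(compartment: str) -> List[str]:
    """Collect instances in the compartment and, recursively, in its subcompartments."""
    found = list(INSTANCES.get(compartment, []))
    for child in CHILDREN.get(compartment, []):
        found.extend(instances_in_scope(child))
    return found


print(instances_in_scope("prod"))  # ['web-1', 'db-1']
print(instances_in_scope("root"))  # ['web-1', 'db-1', 'test-1']
```

Targeting the root compartment in this toy model returns every instance, which matches the statement that a root-level configuration covers the entire tenancy.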

The Scanning service detects vulnerabilities in the following platforms:

  1. Oracle Linux
  2. CentOS
  3. Ubuntu
  4. Windows (no CIS benchmarks)

Oracle Vulnerability Scanning Service helps improve your security posture in Oracle Cloud by routinely checking hosts for potential vulnerabilities. The service generates reports with metrics and details about these vulnerabilities.

High Level Architecture

Key Service Concepts

Scan Recipe

Scanning parameters for a type of cloud resource, including what information to examine and how often.

Target

One or more cloud resources that you want to scan using a specific recipe. Resources in a target are of the same type, such as compute instances.

Host Scan

Metrics about a specific cloud resource that was scanned, including the vulnerabilities that were found, their risk levels, and CIS benchmark compliance. The Scanning service uses a host agent to detect these vulnerabilities.

Port Scan

Open ports that were detected on a specific cloud resource that was scanned. The Scanning service can detect open ports using a host agent, or using a network mapper that searches your public IP addresses.

Vulnerabilities Report

Information about a specific type of vulnerability that was detected in one or more targets, like a missing update for an OS package.

Integration with Cloud Guard

You can view security vulnerabilities identified by the Scanning service in Cloud Guard. Cloud Guard alerting can help customers reduce the time from detection to remediation.

Access the Service from OCI Console

Configure the VSS for your tenancy or specific compartment

Create Compute Resource

Result & Remediate

Reference Resources

Whitepaper

Reference architecture

OCI Documentation


Quick Start for OCI Vault

Historically, customers stored master encryption keys and secrets in server configuration files or in code. As we all know, “Data Is the New Oil of the Digital Economy”. In the cloud, customers can choose the option that best secures their data, which is one reason the cloud can be a more secure platform than on-premises.

In this article we will focus on the overview of service – OCI Vault, the types of offering based on the use case, key capabilities and how to use the Vault with various OCI services.

The Vault service helps you centrally manage the encryption keys that protect your data and the secret credentials that you use to access resources. Vaults securely store master encryption keys and secrets that you might otherwise store in configuration files or in code.

It lets you centrally manage and control the use of keys and secrets across a wide range of OCI services and applications. OCI Vault is a secure, resilient managed service that lets you focus on your data encryption needs without worrying about time-consuming administrative tasks such as hardware provisioning, software patching, and high availability.

Key Management uses hardware security modules (HSMs) that meet Federal Information Processing Standards (FIPS) 140-2 Security Level 3 security certification to protect your keys. You can create master encryption keys protected either by HSM or by software. With HSM-protected keys, all cryptographic operations and storage of keys happen inside the HSM. With software-protected keys, your encryption keys are stored and processed in software, but are secured at rest with a root key from the HSM.

The following key management capabilities are available when you use the Vault service.

  • Create your own encryption keys that protect your data
  • Bring your own keys
  • Rotate your keys
  • Support for cross-region backup and restore for your Keys
  • Constrain permissions on keys using IAM policies
  • Integration with OCI services: Oracle Autonomous Database, Exadata Databases (without Oracle Data Guard enabled), Oracle Block Storage, Oracle File Storage, Oracle Object Storage, Streaming, and Container Engine for Kubernetes

High Level Vault Service Integration Architecture

Get Started with Vault

1. Ensure that the limits for your tenancy allow for creation of the Vault type you intend to create.

2. Ensure that Oracle Identity and Access Management (IAM) policies have been created for the user account to have the necessary permissions to create a Vault. See IAM Policy Reference to construct a statement.

3. You first create a Vault by selecting Security from the Oracle Cloud Infrastructure Console, and then Vault.

Create a Vault and select from one of the two available Vault types that best fits your isolation and processing requirements:

  1. Virtual Private Vault: Choose a Virtual Private Vault if you require increased isolation on the HSM and dedicated processing of encrypt/decrypt operations.
  2. Vault (Default): Choose the default Vault if you are willing to accept a moderate isolation (multitenant partition in HSM) and shared processing for encrypt/decrypt operations.

4. Create the [Master Encryption] Key(s) inside your Vault. Master encryption keys can have one of two protection modes: HSM or software.

  • A master encryption key protected by an HSM is stored on an HSM and cannot be exported from the HSM. All cryptographic operations involving the key also happen on the HSM.
  • A master encryption key protected by software is stored on a server and can be exported from the server to perform cryptographic operations on the client instead of on the server. While at rest, the software-protected key is encrypted by a root key on the HSM.

5. Ensure that the IAM policies for the service or entity calling Vault grant the necessary permissions.

Example: allow service objectstorage-us-ashburn-1 to use keys in compartment

Use the key(s):

  • With native Oracle Cloud Infrastructure storage: When creating storage (bucket, file, volume), mark with “ENCRYPT USING CUSTOMER-MANAGED KEYS”, then select the Vault and the Master Encryption Key. Data in that bucket/volume/file storage will be encrypted with a data encryption key wrapped with the Master Encryption Key in Vault.
  • With crypto operations, using Command Line Interface (CLI) as an example: oci kms crypto encrypt --key-id --plaintext
  • Crypto operations are available in SDK and API as well. For more details, see Overview of Vault in the documentation.

6. Monitor your usage of operations with metrics in the console and the Monitoring service. See the metrics and dimensions.

Using Keys

You can directly submit data to Key Management APIs to encrypt and decrypt using your master encryption keys stored in the Vault.

Also, you can encrypt your data locally within your applications and OCI services using a method known as Envelope encryption.

With envelope encryption, you generate and retrieve Data Encryption Keys (DEKs) from the Key Management APIs. DEKs are not stored or managed in the Key Management service but are encrypted by your Master Encryption Key. Your applications can use a DEK to encrypt your data and store the encrypted DEK along with the data. When your applications want to decrypt the data, they call decrypt on the Key Management API with the encrypted DEK to retrieve the plaintext DEK. You can then decrypt your data locally with the DEK.

Key Management supports sending up to 4 KB of data to be encrypted directly. In addition, envelope encryption can offer significant performance benefits. When you encrypt data directly with Key Management APIs, it must be transferred over the network. Envelope encryption reduces the network load since only the request and delivery of the much smaller DEK go over the network. The DEK is used locally in your application or encrypting OCI service, avoiding the need to send the entire block of data.
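The envelope encryption flow can be sketched in Python using only the standard library. Everything here is an assumption made for illustration: `FakeKeyService` is a hypothetical stand-in for the Key Management API, and `_toy_cipher` is a deliberately simple XOR keystream that stands in for a real cipher such as AES. It is NOT secure and must not be used to protect real data; the point is only to show which party holds which key and what crosses the network.

```python
import hashlib
import os
from typing import Tuple


def _toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream. NOT secure; stands in for AES."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


class FakeKeyService:
    """Hypothetical stand-in for the Key Management API; the master key never leaves it."""

    def __init__(self) -> None:
        self._master_key = os.urandom(32)

    def generate_data_key(self) -> Tuple[bytes, bytes]:
        """Return (plaintext DEK, DEK encrypted under the master key)."""
        dek = os.urandom(32)
        return dek, _toy_cipher(self._master_key, dek)

    def decrypt(self, encrypted_dek: bytes) -> bytes:
        return _toy_cipher(self._master_key, encrypted_dek)


def encrypt_record(kms: FakeKeyService, plaintext: bytes) -> Tuple[bytes, bytes]:
    dek, wrapped_dek = kms.generate_data_key()
    # Encrypt locally, then store the wrapped DEK next to the ciphertext.
    return wrapped_dek, _toy_cipher(dek, plaintext)


def decrypt_record(kms: FakeKeyService, wrapped_dek: bytes, ciphertext: bytes) -> bytes:
    dek = kms.decrypt(wrapped_dek)  # only the small DEK goes to the key service
    return _toy_cipher(dek, ciphertext)


kms = FakeKeyService()
payload = b"x" * 10_000  # larger than the 4 KB direct-encryption limit
wrapped_dek, blob = encrypt_record(kms, payload)
assert decrypt_record(kms, wrapped_dek, blob) == payload
```

In this sketch the 10,000-byte payload never leaves the application; only the 32-byte DEK is sent to the key service, which is the network saving described above.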

OCI offers two encryption choices when provisioning resources

Oracle Managed is the default encryption for many OCI services. Oracle Managed means data is encrypted at rest with an encryption key whose lifecycle management is controlled by Oracle. Customers who don’t want to manage or access their encryption keys and are looking for the easiest way to protect all their data stored in OCI can choose Oracle Managed encryption.

Customer-Managed encryption is offered by the OCI Vault Key Management service, where the customer controls and manages the keys that protect their data. In addition, customers who require elevated security and FIPS 140-2 Level 3 protection to meet compliance choose Customer-Managed encryption, because the encryption keys are stored in hardware security modules (HSMs).

Create Resource with OCI Vault

For more information, see OCI Documentation 

Reference OCI Vault FAQ


OCI DRG functionality expanded in Oracle Cloud

DRG functionality has been expanded to include the following capabilities:

  • You can attach a DRG to more than one VCN to provide inter-VCN network connectivity. VCNs can be in the same or different tenancies. 
  • You can now assign a different route table and policy to each network resource attached to your DRG, enabling granular routing control. For instance, by connecting all your VCNs and on-premises networks to a single DRG used as a “Hub,” you have a single central gateway to configure traffic routing and Layer 3 isolation. One possible use case of routing policy is directing all traffic passing through the DRG to a network virtual appliance or firewall.
  • Your on-premises network connected to a DRG in one region can access networks connected to a DRG in a different region using a remote peering connection (RPC).
  • You can now enable equal cost multi-path (ECMP) routing towards your IPSec VPN and FastConnect connections to support active-active scenarios. ECMP is controlled on a per route table basis.
  • Remote peering connections can now connect DRGs in the same region or different tenancies.

Use case demonstration in Oracle Blog

Introducing global connectivity and enhanced cloud networking with the dynamic routing gateway

Latest OCI Release Notes update

OCI Networking Release Notes

Exadata Storage expansion

I had a chance to explore and take part in this DBMA task. In this article, I will summarize and walk through a procedure for adding a new cell to an existing Exadata Database Machine.

Most of us know the capabilities that Exadata Database Machine delivers. It’s a known fact that Exadata comes in different fixed rack sizes:

    • 1/8 rack (2 db nodes, 3 cells),
    • quarter rack (2 db nodes, 3 cells),
    • half rack (4 db nodes, 7 cells) and
    • full rack (8 db nodes, 14 cells). 

When you want to expand the capacity, it must be in fixed steps as well: 1/8 to quarter, quarter to
half, and half to full.

 

With Exadata X5 Elastic configuration, one can also have customized sizing by extending capacity of the rack
by adding any number of DB servers or storage servers or combination of both, up to the maximum allowed capacity
in the rack.

Preparing to Extend Exadata Database Machine


[0] Validate the environment
Before starting the activity, run Exachk and validate the environment.
Also check for any open cell alerts.

dcli -g /root/cell_group -l root "cellcli -e list alerthistory where endTime=null and alertShortName=Hardware and alertType=stateful and severity=critical"

[1] Ensure the hardware is placed in the rack and all necessary network and cabling requirements are completed.
(2 IPs from the management network are required for the new cell.)

[2] Re-image or upgrade the cell image

2.1 Get the imageinfo output from one of the existing cell servers.
2.2 Log in to the new cell through ILOM, connect to the console as the root user, and get the imageinfo output.
2.3 If the image version on the new cell doesn’t match the existing image version, either
download the exact image version and re-image the new cell, or upgrade the image on the existing servers.

Review “MOS Doc ID 2151671.1” if you want to reimage the new cell.
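The comparison in step 2.3 is easy to script once you have saved the imageinfo output from both cells. The Python sketch below is only an illustration: it assumes the output contains a line labeled "Active image version", and the version strings in the samples are hypothetical values, so check them against what your own cells print.

```python
def image_version(imageinfo_output: str) -> str:
    """Extract the value of the 'Active image version' line (label assumed)."""
    for line in imageinfo_output.splitlines():
        label, _, value = line.partition(":")
        if label.strip().lower() == "active image version":
            return value.strip()
    raise ValueError("no 'Active image version' line found")


def needs_reimage(existing_output: str, new_output: str) -> bool:
    """True when the new cell's image differs from the existing cell's image."""
    return image_version(existing_output) != image_version(new_output)


# Hypothetical sample output, trimmed to the one line the sketch reads.
existing_cell = "Active image version: 12.1.2.3.1.160411\n"
new_cell = "Active image version: 12.1.2.1.0.141206\n"

print(needs_reimage(existing_cell, new_cell))       # True
print(needs_reimage(existing_cell, existing_cell))  # False
```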

[3] Add the IP addresses acquired for the new cell to the /etc/oracle/cell/network-config/cellip.ora file on each DB node.

To do this, run the steps below on the first DB server in the cluster to back up the file before editing it:

cd /etc/oracle/cell/network-config
cp cellip.ora cellip.ora.orig
cp cellip.ora cellip.ora-bak

[4] If ASR alerting was set up on the existing storage cells, configure cell ASR alerting for the cell being added

List the cell attributes required for configuring cell ASR alerting.
Run the following command from any existing storage grid cell:

CellCLI> list cell attributes snmpsubscriber

Apply the same SNMP values to the new cell by running the command below as the celladmin user,
as shown in the below example:

CellCLI> alter cell snmpSubscriber=((host='10.20.14.21',port=162,community=public))

[5] Configure cell alerting for the cell being added.

List the cell attributes required for configuring cell alerting.
Run the following command from any existing storage grid cell:

CellCLI> list cell attributes notificationMethod,notificationPolicy,smtpToAddr,smtpFrom,smtpFromAddr,smtpServer,smtpUseSSL,smtpPort

Apply the same values to the new cell by running the command below as the celladmin user,
as shown in the example below:

CellCLI> alter cell notificationmethod='mail,snmp',notificationpolicy='critical,warning,clear',smtptoaddr='dba@email.com',smtpfrom='Exadata',smtpfromaddr='dba@email.com',smtpserver='10.20.14.21',smtpusessl=FALSE,smtpport=25

[6] Create cell disks on the cell being added

Log in to the cell as celladmin and run the following command:

CellCLI> create celldisk all

[7] Check that the flash log was created by default:

CellCLI> list flashlog

You should see the name of the flash log. It should look like cellnodename_FLASHLOG, and its status should be “normal”. If the flash log does not exist, create it using:

CellCLI> create flashlog all

[8] Check the current flash cache mode and compare it to the flash cache mode on existing cells:

CellCLI> list cell attributes flashcachemode

To change the flash cache mode to match the flash cache mode of existing cells, do the following:

1. If the flash cache exists and the cell is in WriteBack flash cache mode,
you must first flush the flash cache:

CellCLI> alter flashcache all flush

Wait for the command to return.

2. Drop the flash cache:

CellCLI> drop flashcache all

3. Change the flash cache mode:

CellCLI> alter cell flashCacheMode=writeback

The value of the flashCacheMode attribute is either writeback or writethrough.
The value must match the flash cache mode of the other storage cells in the cluster.

4. Create the flash cache:

CellCLI> create flashcache all

[9] Create grid disks on the cell being added.

Query the size and cachingpolicy of the existing grid disks from an existing cell:

CellCLI> list griddisk attributes name,asmDiskGroupName,cachingpolicy,size,offset

  • For each disk group found by the above command, create grid disks on the new cell that is being added to the cluster.
  • Match the size and the cachingpolicy of the existing grid disks for the disk group reported by the command above.
  • Grid disks should be created in the order of increasing offset to ensure similar layout and performance characteristics as the existing cells.
  • For example, the “list griddisk” command could return something like this:
DATAC1 default 5.6953125T 32M
DBFS_DG default 33.796875G 7.1192474365234375T
RECOC1 none 1.42388916015625T 5.6953582763671875T

When creating grid disks, begin with DATAC1, then RECOC1, and finally DBFS_DG using the following commands:

CellCLI> create griddisk ALL HARDDISK PREFIX=DATAC1, size=5.6953125T, cachingpolicy='default', comment="Cluster cluster-clux6 DR diskgroup DATAC1"

CellCLI> create griddisk ALL HARDDISK PREFIX=RECOC1, size=1.42388916015625T, cachingpolicy='none', comment="Cluster cluster-clux6 DR diskgroup RECOC1"

CellCLI> create griddisk ALL HARDDISK PREFIX=DBFS_DG, size=33.796875G, cachingpolicy='default', comment="Cluster cluster-clux6 DR diskgroup DBFS_DG"

CAUTION: Be sure to specify the EXACT size shown along with the unit (either T or G).
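Working out the creation order by hand is error-prone when the offsets mix M, G, and T units. The Python sketch below takes the example listing above, sorts the disk groups by increasing offset, and prints the matching create commands. It assumes the four-column layout shown in the example (disk group, cachingpolicy, size, offset) and binary units; the function names are made up for illustration.

```python
from typing import List

# Binary multipliers for the unit suffixes that appear in the listing (assumed).
UNITS = {"M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}


def to_bytes(value: str) -> float:
    """Convert a value such as '32M' or '5.69T' to bytes, for sorting only."""
    return float(value[:-1]) * UNITS[value[-1]]


def plan_griddisks(listing: str) -> List[str]:
    """Return create griddisk commands ordered by increasing offset."""
    rows = [line.split() for line in listing.strip().splitlines()]
    rows.sort(key=lambda row: to_bytes(row[3]))
    # The size string is reused exactly as listed, unit included.
    return [
        f"create griddisk ALL HARDDISK PREFIX={group}, size={size}, cachingpolicy='{policy}'"
        for group, policy, size, _offset in rows
    ]


listing = """
DATAC1 default 5.6953125T 32M
DBFS_DG default 33.796875G 7.1192474365234375T
RECOC1 none 1.42388916015625T 5.6953582763671875T
"""
for command in plan_griddisks(listing):
    print(command)
```

For the sample listing this prints the DATAC1 command first, then RECOC1, then DBFS_DG, which is the same order used in the commands above. The comment attribute is left out of the generated commands on purpose, since it is specific to each cluster.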

[10] Verify the newly created grid disks are visible from the Oracle RAC nodes.
Log in to each Oracle RAC node and run the following command:

$GI_HOME/bin/kfod op=disks disks=all | grep cellName_being_added

This should list all the grid disks created as above.

[11] Add the newly created grid disks to the respective existing ASM disk groups.

ALTER DISKGROUP disk_group_name ADD DISK 'comma_separated_disk_names';

The command above kicks off an ASM rebalance at the default power level.
Monitor the progress of the rebalance by querying gv$asm_operation :

SQL> select * from gv$asm_operation;

Once the rebalance completes, the addition of the cell to the Oracle RAC is complete.

[12] Run the latest Exachk to ensure that the resulting configuration implements the latest best practices for Oracle Exadata.

Thank you Oracle ACE Syed Jaffar Hussain for sharing his experience


Manually take an ILOM snapshot

DBMAs often have to collect an ILOM snapshot at the request of Oracle Support, which needs it to troubleshoot Exadata hardware issues.

I had to diagnose a hardware issue recently and was not able to use the web interface because of a firewall issue. Fortunately, you can generate an ILOM snapshot using the following CLI method.

[1] Let’s connect and set the snapshot type to normal

Step 1 : Log in to ILOM as the root user.

[root@myclusterdb01 ~]# ssh myclustercel05-ilom
Password:
Oracle(R) Integrated Lights Out Manager
Version 3.2.7.30.a r112904
Copyright (c) 2016, Oracle and/or its affiliates. All rights reserved.
Warning: HTTPS certificate is set to factory default.
Hostname: myclustercel05-ilom

Step 2 : Set snapshot dataset to normal.

-> set /SP/diag/snapshot dataset=normal
Set 'dataset' to 'normal'

Step 3 : Set snapshot output location.

-> set /SP/diag/snapshot dump_uri=sftp://root:"password!"@10.21.101.22/tmp

Set 'dump_uri' to 'sftp://root:"password!"@10.21.101.22/tmp'

Step 4 : Change directory to snapshot

-> cd /SP/diag/snapshot
/SP/diag/snapshot

Step 5 : Check the status of the snapshot and make sure it’s running

-> show
/SP/diag/snapshot

Targets:
Properties:
dataset = normal
dump_uri = (Cannot show property)
encrypt_output = false
result = Running

Step 6: Keep checking the status until it’s completed. This may take up to 10 minutes.

-> show
/SP/diag/snapshot
Targets:

Properties:
dataset = normal
dump_uri = (Cannot show property)
encrypt_output = false
result = Collecting data into sftp://oracle@10.21.101.22/etc/snapshot/exa01dbadm01-ilom_XXXX30AG_2018-09-14T23-04-46.zip

TIMEOUT: /usr/local/bin/spshexec show /SP/bootlist
TIMEOUT: /usr/local/bin/create_ueficfg_xml

Snapshot Complete.
Done.

Step 7: Upload the snapshot file to Oracle Support.

oracle@10.21.101.22/tmp/exa01dbadm01-ilom_XXXX30AG_2018-09-14T23-04-46.zip

[2] Connect and set the snapshot type to full

A full ILOM snapshot (the one Oracle Support will most likely ask you for) may (yes, “may”) reset the host, as per the documentation:
Note – Using this option might reset the host operating system.
“Reset the host” means rebooting the host.

Fred mentioned in his blog that he has done this a few times on production cells and they were never rebooted, but keep it in mind if you are asked to take a full ILOM snapshot of a database server. A cell reboot would be transparent, but a database server reboot is a different story.

[root@myclusterdb01 ~]# ssh myclustercel05-ilom
Password:
Oracle(R) Integrated Lights Out Manager
Version 3.2.7.30.a r112904
Copyright (c) 2016, Oracle and/or its affiliates. All rights reserved.
Warning: HTTPS certificate is set to factory default.
Hostname: myclustercel05-ilom
-> set /SP/diag/snapshot dataset=full

Set 'dataset' to 'full'

->

Then start the ILOM snapshot by setting dump_uri to the IP address of the 
target system that will receive the snapshot, together with its root password
(in the example below, the snapshot is copied to /tmp):

-> set /SP/diag/snapshot dump_uri=sftp://root:root_password@10.11.12.13/tmp
Collecting a "full" dataset may reset the host. Are you sure (y/n)? y
Set 'dump_uri' to 'sftp://root@10.11.12.13/tmp'
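
Notice that ILOM echoes the URI back without the password. If you keep these commands in a runbook or a ticket, it is worth doing the same before pasting them anywhere. Below is a minimal sketch using only the Python standard library; the function name `redact_dump_uri` is made up for illustration, and it assumes the password contains no "/", "?" or "#" characters, which would confuse the URL parser.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_dump_uri(uri: str) -> str:
    """Return the dump_uri with the password removed (hypothetical helper)."""
    parts = urlsplit(uri)
    # Rebuild the network location from user, host and optional port only.
    # Note: urlsplit lower-cases the host name.
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    if parts.username:
        host = f"{parts.username}@{host}"
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    print(redact_dump_uri("sftp://root:root_password@10.11.12.13/tmp"))
```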

Now that the ILOM snapshot has been started, 
you can monitor it with the command below:

-> show /SP/diag/snapshot

/SP/diag/snapshot
Targets:

Properties:
dataset = full
dump_uri = (Cannot show property)
encrypt_output = false
result = Running

Commands:
cd
set
show

->

After a few minutes, the ILOM snapshot should show as completed:

-> show /SP/diag/snapshot

/SP/diag/snapshot
Targets:
Properties:
dataset = full
dump_uri = (Cannot show property)
encrypt_output = false
result = Collecting data into sftp://root@10.11.12.13/tmp/myclustercel07-ilom_1133FMM02D_2018-02-04T23-18-06.zip
Snapshot Complete.
Done.

Commands:
cd
set
show

->

The resulting file is quite small and easy to transfer to MOS:

[root@myclusterdb01 ~]# du -sh /tmp/myclustercel07-ilom_1133FMM02D_2018-02-04T23-18-06.zip
2.5M /tmp/myclustercel07-ilom_1133FMM02D_2018-02-04T23-18-06.zip
[root@myclusterdb01 ~]#
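
If you capture the output of "show /SP/diag/snapshot" in a script (for example through an SSH session), the "key = value" lines are easy to parse. The sketch below is an illustration only: the function name `parse_snapshot_status` is made up, and it relies on nothing more than the output layout shown in the transcripts above.

```python
def parse_snapshot_status(show_output: str) -> dict:
    """Extract the properties from `show /SP/diag/snapshot` output (hypothetical helper)."""
    props = {}
    for line in show_output.splitlines():
        # Property lines look like "dataset = full"; other lines are ignored.
        key, sep, value = line.partition(" = ")
        if sep:
            props[key.strip()] = value.strip()

    result = props.get("result", "")
    return {
        "dataset": props.get("dataset"),
        "result": result,
        "running": result == "Running",
        # ILOM prints this marker once the archive has been written.
        "complete": "Snapshot Complete." in show_output,
    }
```

A polling loop would call this after each "show" and stop as soon as "complete" is true.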

Thank you Oracle ACE Fred Denis for sharing his experience


Manually reboot a database server using its ILOM

Depending on the requirement, this will be a DBMA task.

We will use the server's ILOM, the administration console that each Exadata component has. Be sure to have:
  • The database server ILOM IP (usually <dbserver-name>-ilom, like <mycluster>db02-ilom)
  • The ILOM root password (just in case, the default password is welcome1)
[root@myclusterdb01 ~]# ssh myclusterdb01-ilom
Warning: Permanently added the RSA host key for IP address '10.191.84.24' to the list of known hosts.
Password
Oracle(R) Integrated Lights Out Manager
Version 3.2.8.25 r114493
Copyright (c) 2016, Oracle and/or its affiliates. All rights reserved.
Warning: HTTPS certificate is set to factory default.
Hostname: myclusterdb04-ilom

-> reset /SYS 
Are you sure you want to reset /SYS (y/n)? y
Performing hard reset on /SYS
->

This starts a hard reboot of the myclusterdb01 database server.

You can then connect to the console to see what is happening (the server boot logs):

-> start /SP/console
Are you sure you want to start /SP/console (y/n)? y
Serial console started.  To stop, type ESC (
. . .
[INFO] /usr/sbin/ipmitool user set name 4 iu_ngtmh
[INFO] /usr/sbin/ipmitool user set password 4 ********
[INFO] Executing: /usr/bin/mstflint -y -d /proc/bus/pci/40/00.0 -i /var/log/exadatatmp/firmware/ActualFirmwareFiles/fw-ConnectX3-rel-2_35_5532-15-7046442_7092757.bin  burn

    Current FW version on flash:  2.11.1280
    New FW version:               2.35.5532

[INFO] run /usr/sbin/ipmitool cmd to set /SP/users/iu_ngtmh/role=aucro
Burning FS2 FW image without signatures - 7[INFO] export IPMI_PASSWORD=********
[INFO] /usr/sbin/ipmiflash -v -I lanplus -H 10.191.84.24 -U iu_ngtmh -E write /var/log/exadatatmp/firmware/ActualFirmwareFiles/ILOM-3_2_10_22_a_r121452-Sun_Server_X4-2.pkg force script config delaybios warning=0
Burning FS2 FW image without signatures - OK
Restoring signature                     - OK
[INFO] Waiting for the service processor to finish firmware upgrade for up to 1200 seconds.
. . .

Give the server a few minutes to reboot and you’re done.

Please keep in mind that:
  • Unlike with an Infiniband Switch, you do not have to use the spsh command to jump into the ILOM shell, as you connect directly to the dedicated ILOM IP address
  • You have to use this weird ILOM syntax to quit the console: ESC and then “(”
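
Rather than guessing when the server is back, you can poll its SSH port from another host. The following is a minimal sketch using only the standard library; the function name `wait_for_ssh` and the default timings are my own choices, not something ILOM or Exadata provides. An open SSH port only tells you the operating system is up; the clusterware and databases may need a little longer.

```python
import socket
import time


def wait_for_ssh(host: str, port: int = 22, timeout: float = 900.0,
                 interval: float = 10.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means sshd is accepting connections again.
            with socket.create_connection((host, port), timeout=5.0):
                return True
        except OSError:
            pass  # still rebooting (refused, unreachable or timed out)
        if time.monotonic() >= deadline:
            return False
        time.sleep(min(interval, max(0.0, deadline - time.monotonic())))


if __name__ == "__main__":
    # Host name taken from the example above.
    print(wait_for_ssh("myclusterdb01", timeout=1200))
```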

Thank you Oracle ACE Fred Denis for sharing his experience


Manually reboot an Infiniband Switch

Depending on the requirement, this will be a DBMA task.

  • Do NOT reboot both Exadata Switches at the same time — you’ll get into lots of trouble
  • An IB Switch ILOM is embedded within the Switch itself, so it has to be accessed through the ILOM shell with the “spsh” command; then use the “reset /SP” command to reboot the Switch, as shown below
# ssh myclustersw-ib3
# spsh
-> reset /SP
Are you sure you want to reset /SP (y/n)? y
[root@myclusterdb01 ~]# ssh myclustersw-ib3

Last login: Wed Dec 20 17:58:46 2017 from myclusterdb01.mydomain.com
You are now logged in to the root shell.
It is recommended to use ILOM shell instead of root shell.
All usage should be restricted to documented commands and documented
config files.
To view the list of documented commands, use "help" at linux prompt.

[root@myclustersw-ib3 ~]# spsh

Oracle(R) Integrated Lights Out Manager
Version ILOM 3.0 r47111
Copyright (c) 2012, Oracle and/or its affiliates. All rights reserved.

-> reset /SP
Are you sure you want to reset /SP (y/n)? y

Performing reset on /SP
Broadcast message from root (Sun Jan 28 20:14:42 2018):
The system is going down for reboot NOW!
-> Connection to myclustersw-ib3 closed by remote host.
Connection to myclustersw-ib3 closed.

[root@myclusterdb01 ~]#
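
Because the switches must never be rebooted at the same time, any automation around this should process them strictly one by one and stop at the first problem. Here is a small sketch of that control flow. The `reboot` and `is_healthy` callables are hypothetical placeholders that you would implement yourself (for example an SSH wrapper around the spsh session above, and a check that waits for the switch to answer again); nothing here is an Oracle-provided API.

```python
from typing import Callable, List, Sequence


def rolling_reboot(switches: Sequence[str],
                   reboot: Callable[[str], None],
                   is_healthy: Callable[[str], bool]) -> List[str]:
    """Reboot switches one at a time; raise on the first failure."""
    done: List[str] = []
    for name in switches:
        # Never touch a switch while any of its peers is unhealthy.
        peers = [s for s in switches if s != name]
        if not all(is_healthy(s) for s in peers):
            raise RuntimeError(f"refusing to reboot {name}: a peer switch is not healthy")
        reboot(name)
        # is_healthy is expected to wait until the switch is back (or give up).
        if not is_healthy(name):
            raise RuntimeError(f"{name} did not come back; stopping")
        done.append(name)
    return done
```

The peer check before each reboot is the important part: it keeps at least one switch serving the fabric at all times.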

https://docs.oracle.com/cd/E19273-01/html/821-0243/gixyc.html

Thank you Oracle ACE Fred Denis for sharing his experience


Shut down or reboot an Exadata storage cell without affecting ASM

This article covers the DBMA commands that are useful when a storage cell has to be shut down or rebooted, for example for a maintenance activity.

[1] Verify the existing disk_repair_time attribute for all diskgroups
SQL> select dg.name,a.value from 
v$asm_diskgroup dg, v$asm_attribute a 
where dg.group_number=a.group_number and
a.name='disk_repair_time';

[2] The default disk_repair_time is only 3.6 hours, so increase it if the cell may stay offline for longer than that.
SQL> ALTER DISKGROUP DATA SET ATTRIBUTE 'DISK_REPAIR_TIME'='8.5H';
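
To sanity-check whether a given disk_repair_time covers the outage you are planning, the value returned by the query in step [1] can be converted to minutes. This is a small sketch; the function names are made up, and it deliberately accepts only values with an explicit h or m unit (such as '3.6h' or '8.5H') rather than guessing at a missing unit.

```python
import re


def repair_time_minutes(value: str) -> float:
    """Convert a disk_repair_time value such as '3.6h' or '90m' to minutes."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([hHmM])\s*", value)
    if not match:
        raise ValueError(f"unsupported disk_repair_time value: {value!r}")
    number, unit = float(match.group(1)), match.group(2).lower()
    return number * 60 if unit == "h" else number


def covers_outage(value: str, outage_minutes: float) -> bool:
    """True if the repair window is at least as long as the planned outage."""
    return repair_time_minutes(value) >= outage_minutes


if __name__ == "__main__":
    # A five-hour outage against the default and the adjusted setting.
    print(covers_outage("3.6h", 300), covers_outage("8.5H", 300))
```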
[3] Next, check whether ASM will tolerate the grid disks going OFFLINE. 
The following command should return asmdeactivationoutcome 'Yes' for every grid disk listed:

cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome

Note: Shutting down the cell services when one or more grid disks do not return 
asmdeactivationoutcome='Yes' will cause Oracle ASM to dismount the affected disk group, 
causing the databases to shut down abruptly.
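
On a cell with many grid disks it is easy to miss a single line that does not say 'Yes'. The check can be scripted by parsing the cellcli output. The sketch below assumes the usual whitespace-separated layout of three columns (name, asmmodestatus, asmdeactivationoutcome); the function name and the disk names in the example are made up for illustration.

```python
from typing import List


def unsafe_griddisks(cellcli_output: str) -> List[str]:
    """Return the grid disks whose asmdeactivationoutcome is not 'Yes'."""
    unsafe = []
    for line in cellcli_output.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or unexpected lines
        name = fields[0]
        # The outcome may be a multi-word message, so rejoin the remainder.
        outcome = " ".join(fields[2:])
        if outcome != "Yes":
            unsafe.append(name)
    return unsafe


if __name__ == "__main__":
    sample = (
        "\t DATA_CD_00_cel01\t ONLINE\t Yes\n"
        "\t RECO_CD_00_cel01\t ONLINE\t Cannot de-activate due to other offline disks in the diskgroup\n"
    )
    print(unsafe_griddisks(sample))
```

Only proceed to the next step when the returned list is empty.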

[4] Inactivate all grid disks on the cell you wish to power down/reboot:
cellcli -e alter griddisk all inactive

[5] Confirm that the griddisks are now offline by performing the following actions:
cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome
cellcli -e list griddisk

Note:
The first command should show either asmmodestatus=OFFLINE or 
asmmodestatus=UNUSED, and asmdeactivationoutcome=Yes, for all griddisks once the disks are 
offline in ASM. Only then is it safe to proceed with shutting down or restarting the cell.

[6] You can now shut down the cell, or reboot it by using the -r option instead of -h:
# shutdown -h now
# shutdown -r now

[7] Once the cell comes back online, reactivate the griddisks:

cellcli -e alter griddisk all active

[8] Issue the command below and all disks should show 'active':

cellcli -e list griddisk

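[9] Verify the grid disk status. Wait until asmmodestatus shows ONLINE for all grid disks (SYNCING means the resynchronization is still in progress) before touching another cell; the second command should return no rows:

cellcli -e list griddisk attributes name,asmmodestatus
cellcli -e list griddisk attributes name where asmdeactivationoutcome != 'Yes'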
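
While the disks resynchronize, it helps to see at a glance how many are still not ONLINE. The sketch below groups the output of "cellcli -e list griddisk attributes name,asmmodestatus" by status. It assumes two whitespace-separated columns per line; the function names and the sample disk names are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def disks_by_status(cellcli_output: str) -> Dict[str, List[str]]:
    """Group grid disk names by their asmmodestatus value."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for line in cellcli_output.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            groups[fields[1]].append(fields[0])
    return dict(groups)


def resync_finished(cellcli_output: str) -> bool:
    """True only when at least one disk is listed and every disk is ONLINE."""
    groups = disks_by_status(cellcli_output)
    return bool(groups) and set(groups) == {"ONLINE"}


if __name__ == "__main__":
    sample = "\t DATA_CD_00_cel01\t ONLINE\n\t DATA_CD_01_cel01\t SYNCING\n"
    print(disks_by_status(sample), resync_finished(sample))
```

Treating an empty listing as "not finished" is a deliberate choice, so that a failed cellcli call is never mistaken for a healthy cell.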

For detailed information, please refer to MOS Doc ID 1188080.1, "Steps to shut down or reboot an Exadata storage cell without affecting ASM".
